GiantMIDI-Piano is a large-scale symbolic dataset of classical piano music, built by applying the piano_transcription system to a large collection of piano performance recordings. It contains thousands of piano works spanning many composers and styles, each transcribed into a high-precision MIDI file that captures note events, velocities, and pedal usage. The dataset is a resource for music information retrieval (MIR), symbolic music modeling, composer classification, music generation, analysis of the classical piano repertoire, and data-driven research in musicology and AI-based composition. Because it is machine-generated by an automated transcription pipeline, it offers a consistency, scale, and accessibility that would be difficult to achieve by manual transcription. This lets researchers work with large corpora of piano music without copyright restrictions on symbolic data.
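To make the description of note events, velocities, and pedal usage concrete, the following is a minimal sketch of how one transcribed piece might be represented and summarized in Python. The `NoteEvent` and `PedalEvent` classes, the `summarize` function, and the sample values are all hypothetical and written for illustration; they are not part of the dataset or of piano_transcription. In practice the events would be read from a MIDI file with a MIDI parsing library rather than typed by hand.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class NoteEvent:
    """One transcribed note (hypothetical layout, for illustration only)."""
    onset: float    # start time in seconds
    offset: float   # end time in seconds
    pitch: int      # MIDI pitch number, 0-127 (60 is middle C)
    velocity: int   # MIDI velocity, 1-127


@dataclass(frozen=True)
class PedalEvent:
    """One interval during which the sustain pedal is held down."""
    onset: float
    offset: float


def summarize(notes, pedals):
    """Return simple per-piece statistics from note and pedal events."""
    if not notes:
        raise ValueError("need at least one note")
    # Count notes whose onset falls inside any pedal-down interval.
    pedalled = sum(
        1 for n in notes
        if any(p.onset <= n.onset < p.offset for p in pedals)
    )
    return {
        "n_notes": len(notes),
        "lowest_pitch": min(n.pitch for n in notes),
        "highest_pitch": max(n.pitch for n in notes),
        "mean_velocity": mean(n.velocity for n in notes),
        "pedalled_fraction": pedalled / len(notes),
    }


# Made-up events standing in for a transcribed piece (not real dataset content).
notes = [
    NoteEvent(0.0, 0.5, 60, 64),
    NoteEvent(0.5, 1.0, 64, 72),
    NoteEvent(1.0, 2.0, 67, 80),
    NoteEvent(2.0, 3.0, 72, 56),
]
pedals = [PedalEvent(0.4, 1.5)]

summary = summarize(notes, pedals)
print(summary)
```

Statistics of this kind, computed per piece and then aggregated per composer, are one plausible starting point for the corpus-level analysis and classification tasks mentioned above.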