This question is totally relevant and mathematically justified. A camera's sensor at the hardware level is analog. If it records a total dynamic range of 14 stops (let's say) and this data is converted to a 14-bit digital signal, then one could say that 2^14 = 16,384 tones are being recorded per channel, ignoring noise.
If the camera's sensor records a total dynamic range of 10 stops (let's say), and this data is also converted to a 14-bit digital signal, then there are still 16,384 tones being recorded per color channel, ignoring noise.
However, those tones now represent a bright-to-dark range with 4 stops (16 times) less variation in linear light. Ignoring noise, tones are therefore recorded with 16 times (4 stops) finer sensitivity to tiny shifts in color/contrast. This holds only for tones within that 10-stop dynamic range; tones outside it are lost, which is the drawback.
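The arithmetic above can be sketched in a few lines of Python. This is a simplified linear-ADC model (an illustrative assumption; real sensors and RAW encodings are more complicated), measuring light in units of the darkest recordable tone:

```python
BIT_DEPTH = 14
CODES = 2 ** BIT_DEPTH  # 16,384 tones per channel

for dr_stops in (14, 10):
    # Linear range from darkest to brightest tone, in units of the darkest tone
    linear_range = 2.0 ** dr_stops
    # Size of one quantization step in those same units
    step = linear_range / CODES
    print(f"{dr_stops} stops of dynamic range: one step = {step}")
```

The 10-stop sensor's steps come out 16 times smaller than the 14-stop sensor's, matching the 4-stop (16×) finer sensitivity described above.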
A camera sensor X could be designed with the same amount of noise as a camera sensor Y, but with a lower dynamic range for X and a higher dynamic range for Y. If both signals were accurately converted to the same 14-bit RAW digital output, then under the assumption of equal noise, the output from sensor X would always show finer gradation between tones, while the output from sensor Y would always be more resistant to blown highlights and lost shadows.
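Here is a small sketch of that trade-off. The `quantize` function, the noise-floor units, and the example highlight are all illustrative assumptions (a hypothetical linear quantizer, not how any particular camera digitizes its signal):

```python
BITS = 14
CODES = 2 ** BITS

def quantize(light, dr_stops):
    """Quantize linear light (in units of the sensor's noise floor) to an
    integer code. The sensor saturates at 2**dr_stops; brighter input clips."""
    full_scale = 2.0 ** dr_stops
    clipped = min(max(light, 0.0), full_scale)  # blown highlights clip here
    return round(clipped / full_scale * (CODES - 1))

highlight = 2.0 ** 12  # a highlight 12 stops above the noise floor
print(quantize(highlight, 14))  # sensor Y (14 stops): recorded with headroom
print(quantize(highlight, 10))  # sensor X (10 stops): clipped to the top code
```

Sensor Y keeps the highlight, but each of its code steps spans 16 times more light than sensor X's, so X distinguishes subtler tonal differences within its narrower range.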
Nothing in the world can increase without a trade-off, and that is definitely true for dynamic range as well. If all other factors are held constant, increasing dynamic range decreases the gradation between tones. In the limiting case of infinite dynamic range at a fixed bit depth, all levels of signal would be rendered as a single flat tone, just as a delta "spike" function in Fourier analysis corresponds to a signal containing all wavelengths.
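The limiting argument can also be seen another way, under the standard simplifying assumption of a linear 14-bit encoding: each successive stop down from saturation gets half as many codes, so every extra stop of dynamic range is carved out of an ever-thinner slice of the code space.

```python
BITS = 14
CODES = 2 ** BITS

# In a linear encoding, the brightest stop spans codes [CODES/2, CODES),
# the next stop [CODES/4, CODES/2), and so on, halving each time.
for stop in range(1, BITS + 1):
    codes_in_stop = CODES // 2 ** stop
    print(f"stop {stop} below saturation: {codes_in_stop} codes")
```

By the 14th stop only a single code remains: the deepest shadows have essentially no gradation left, and pushing dynamic range further yields stops with no codes at all.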