Until recently, most electronic shutters were progressive (rolling) shutters: the sensor is read row by row, top down or bottom up. That is relatively fast, but not fast enough to prevent exposure gradients, where rows read later are brighter than rows read earlier because they keep collecting light until their turn comes.
To guarantee that no light continues to expose the sensor during readout, a mechanical shutter is used to block light. That's the primary reason.
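To make the gradient concrete, here is a minimal sketch of the per-row exposure under a row-by-row readout. It assumes all rows start exposing at the same instant and that the light is constant; the function name `row_exposures` and all the numbers (4000 rows, 10 µs per row, 1/1000 s exposure) are made up for illustration, not taken from any real sensor.

```python
def row_exposures(n_rows, exposure_s, row_read_s, mechanical_shutter=False):
    """Effective exposure time of each row, in seconds (toy model)."""
    exposures = []
    for row in range(n_rows):
        # Row `row` has to wait for `row` earlier rows to be read out.
        # With a mechanical shutter closed during readout, that wait adds no light.
        extra = 0.0 if mechanical_shutter else row * row_read_s
        exposures.append(exposure_s + extra)
    return exposures


# Hypothetical numbers: 4000 rows, 10 microseconds per row, 1/1000 s exposure.
open_readout = row_exposures(4000, 1 / 1000, 10e-6)
blocked_readout = row_exposures(4000, 1 / 1000, 10e-6, mechanical_shutter=True)

# How much more light the last row collects compared with the first row.
ratio_open = open_readout[-1] / open_readout[0]
ratio_blocked = blocked_readout[-1] / blocked_readout[0]
print(f"last/first row exposure, sensor left open: {ratio_open:.2f}x")
print(f"last/first row exposure, shutter closed:   {ratio_blocked:.2f}x")
```

With the sensor left open, the last row ends up with far more exposure than the first in this toy case; with the light blocked during readout, every row gets the same exposure.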
Today, a more advanced electronic shutter technology called a "global shutter" is becoming more prevalent. A global shutter dumps all pixels simultaneously into a per-pixel backing buffer, or "memory". This stops the exposure of the image signal that is about to be read. The buffered exposure is then read out row by row. When the next frame is ready to be exposed (in either video or continuous stills), the sensor pixels and the backing buffer are reset to zero charge, and the new exposure starts.
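The sequence described above (expose, dump everything into the buffer at once, read the buffer row by row, reset) can be sketched as a toy model. The class `GlobalShutterSensor` and its method names are hypothetical, and the model ignores noise, charge transfer time, and everything else a real sensor has to deal with.

```python
class GlobalShutterSensor:
    """Toy model: a pixel array plus a per-pixel backing buffer."""

    def __init__(self, n_rows, n_cols):
        self.n_rows, self.n_cols = n_rows, n_cols
        self.reset()

    def reset(self):
        # Pixels and backing buffer both go back to zero charge.
        self.pixels = [[0.0] * self.n_cols for _ in range(self.n_rows)]
        self.buffer = [[0.0] * self.n_cols for _ in range(self.n_rows)]

    def expose(self, light, seconds):
        # Every pixel accumulates charge proportional to light * time.
        for row in self.pixels:
            for col in range(self.n_cols):
                row[col] += light * seconds

    def transfer(self):
        # Simultaneous dump of all pixels into the backing buffer.
        self.buffer = [list(row) for row in self.pixels]

    def read_out(self, light, row_read_s):
        # Read the buffer row by row. Light keeps hitting the pixels
        # while this happens, but the buffered values are unaffected.
        frame = []
        for row in range(self.n_rows):
            frame.append(list(self.buffer[row]))
            self.expose(light, row_read_s)
        return frame


sensor = GlobalShutterSensor(n_rows=4, n_cols=3)
sensor.expose(light=1.0, seconds=0.01)
sensor.transfer()
frame = sensor.read_out(light=1.0, row_read_s=0.001)
sensor.reset()  # ready for the next exposure
```

Every row of `frame` carries the same exposure even though the rows are read at different times, which is the whole point of the buffer.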
Global electronic shutters are more expensive, as they require more logic per pixel: the charge transfer logic and the buffer memory both take up space. Shared pixel architectures complicate this further. The transfer of pixel data into the buffer memory also takes some time, and a per-row activation is still required to initiate the transfer (although that can happen much faster than a full row readout). A global shutter can therefore still impose minor limits on frame rate at the kinds of pixel counts we have for still photography, at least at price points that are acceptable to most photographers. When cost is no object, you can achieve high IQ at exceptionally high frame rates, but it definitely increases the cost.
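As a rough back-of-envelope for the frame rate point, the sketch below adds a small per-row transfer activation on top of the per-row readout time. It assumes the steps run strictly one after another with no overlap, which a real sensor may well avoid by pipelining; the function name `max_frame_rate` and the timings are invented for illustration only.

```python
def max_frame_rate(n_rows, row_read_s, row_transfer_s=0.0):
    """Upper bound on frames per second if every row costs read + transfer time."""
    frame_time_s = n_rows * (row_read_s + row_transfer_s)
    return 1.0 / frame_time_s


# Hypothetical timings: 10 microseconds to read a row, 0.5 microseconds to
# activate the transfer for a row (much faster than the readout itself).
without_transfer = max_frame_rate(4000, 10e-6)
with_transfer = max_frame_rate(4000, 10e-6, row_transfer_s=0.5e-6)

print(f"readout only:       {without_transfer:.1f} fps")
print(f"readout + transfer: {with_transfer:.1f} fps")
```

Under these assumptions the transfer step shaves only a little off the achievable frame rate, consistent with calling it a minor limit.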
At some point, I figure electronic first curtain shutter will become the norm for DSLRs. Eventually, the mechanical shutter should be dropped entirely in favor of a full global electronic shutter, even if the mirror remains. I don't foresee Canon doing anything with global shutter until they reduce their transistor size. At 500nm, adding a global shutter would decimate their IQ. At 180nm, they would definitely be more capable of adding global shutter technology, but it would still take up die space and reduce fill factor a bit.