Full frame isn't sharper than APS; that's basically a metaphorical question. The edges of the naturalization of the silicon surface are treated the same, so it's just as easy to get paper cuts from handling a naked APS sensor edge as an FF sensor edge. I know from personal experience...

And this magnification discussion is rather confused, so let's just go through the entire imaging chain:
1) Reality (in glorious smello-vision 3D!)
2) Optical projection, 3D >> 2D
3) We digitize that 2D projection into a "pixel resolution" - in this case via a digital camera sensor
4) This "resolution" is presented - either on screen or in print.
Comparing two different camera formats means that we need to change some stuff in points 2) and 3) so that the result comes out equal. We want to keep 1) and 4) constant! The reality (model, object, scene) is the same, and we want the end result image to be the same.
We need to change the scales between the steps. On the smaller format we use a lower object magnification - a shorter focal length - going from step 2) to 3). The image pixel is unitless, just a datapoint in the grid of data that makes up a digital image. To keep the quality of the intermediate digital image (3) constant, we need to keep per-pixel sharpness and noise constant.
Using the same lens at the same aperture on both systems won't make that happen. Optical defects spread over more pixels on the smaller sensor, since those pixels are physically smaller. But the object resolution - how much detail on the target you can see - can never get worse with smaller pixels, it can only get better. This is the crop effect birders are after: target resolution.
Using a 1.5x shorter lens at the same aperture value on the smaller system will make the field of view the same, but DoF will be deeper (more about this later!). Now if the lens is also 1.5x sharper, i.e. has the same MTF at 45 lp/mm as the larger lens has at 30 lp/mm, object resolution will be equal - and since a pixel is a pixel is a pixel, digitized image resolution will also be equal. But noise will be stronger (for the same reason the DoF is deeper: the physical aperture opening is smaller!).
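If it helps to see the scaling as numbers, here is a rough sketch in Python. The function names are made up for illustration, and the 1.5x crop factor is the one used above; real APS crop factors vary a little between brands.

```python
def equivalent_focal_length(focal_mm: float, crop: float) -> float:
    """Focal length on the smaller format that gives the same field of view."""
    return focal_mm / crop


def required_spatial_frequency(lp_per_mm: float, crop: float) -> float:
    """Spatial frequency on the smaller sensor that corresponds to the same
    detail in the final image as lp_per_mm does on the larger sensor."""
    return lp_per_mm * crop


crop = 1.5  # assumed FF-to-APS crop factor, as in the post

print(round(equivalent_focal_length(50.0, crop), 1))  # 33.3 (mm)
print(required_spatial_frequency(30.0, crop))         # 45.0 (lp/mm)
```

So the smaller format needs the shorter lens to deliver at 45 lp/mm what the longer lens delivers at 30 lp/mm.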
To get noise equal too, you need to keep the amount of captured light energy equal. And to get the light throughput per second equal, we need the same entrance pupil diameter on both systems. The entrance pupil is what "gathers light" from the scene from the optical system's point of view - if light is to reach the sensor, it has to pass through this aperture. The nominal entrance pupil diameter is focal length divided by f/# - that's why it's named an F-stop.
Small "f" for focal length, and the hash "#" sign signifies a unitless number. A 50mm F2.0 then has a nominal 50mm / 2.0 = 25mm front pupil. To get the same field of view on APS we need a ~35mm lens. We still need a 25mm front pupil, so the F-stop needs to be 35mm (f) / 25mm (d) = 1.4 (#) = F1.4.
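The same arithmetic as a small Python sketch, in case you want to plug in other lenses. The helper names are hypothetical, and this only uses the nominal pupil (focal length over f-number), ignoring transmission losses and the like.

```python
def entrance_pupil_mm(focal_mm: float, f_number: float) -> float:
    """Nominal pupil diameter: focal length divided by the f-number."""
    return focal_mm / f_number


def f_number_for_pupil(focal_mm: float, pupil_mm: float) -> float:
    """F-number a lens of this focal length needs for a given pupil diameter."""
    return focal_mm / pupil_mm


pupil = entrance_pupil_mm(50.0, 2.0)      # the 50mm F2.0 on full frame
aps_f = f_number_for_pupil(35.0, pupil)   # the ~35mm lens on APS

print(pupil)            # 25.0 (mm)
print(round(aps_f, 2))  # 1.4
```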
So if you could make a 35mm F1.4 that's 1.5x sharper than a 50mm F2.0, everything would be fine. But that's the problem - that's a very hard thing to do. AND you'd need an APS camera with a 1.5^2 = 2.25x lower base ISO to keep the light energy storage capacity the same, since ISO is area based.
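To put a number on that last point, a quick sketch in Python. The full frame base ISO of 100 is just an assumed example value, not something from a specific camera, and the function name is made up.

```python
def equivalent_base_iso(larger_format_base_iso: float, crop: float) -> float:
    """Base ISO the smaller sensor would need to store the same total light
    energy, scaling with sensor area (crop factor squared)."""
    return larger_format_base_iso / crop ** 2


crop = 1.5          # assumed FF-to-APS crop factor
ff_base_iso = 100   # assumed example value

print(crop ** 2)                                          # 2.25
print(round(equivalent_base_iso(ff_base_iso, crop), 1))   # 44.4
```

So against a full frame camera with base ISO 100, the APS camera would need a base ISO of roughly 44.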