You need to differentiate between the finest high-frequency detail a person can see and the finest low-frequency detail.
The issue is that our eyes are designed to work by sweeping their receptors across the image, not by holding steady.
Here, I'll quote the opening sentences of chapter 29 from "Principles of Neural Science: Fifth Edition" (yes, Neuro once used this book as a source in one of his posts; I took him up on it, and the book agreed with me; more importantly, it's chock-full of information relevant to the "4K Debate"):
VISION REQUIRES EYE MOVEMENTS. Small eye movements are essential for maintaining the contrast of objects that we are examining. Without these movements the perception of an object rapidly fades to a field of gray, a phenomenon correlated with the decreased response of neurons in area V1 (see chapter 25).
Previous chapters also detail how the receptor cells are excited most by movement.
Given that a cell basically requires stimulus to pass over it rather than linger on it, it follows that when people measure visual acuity using a consistent grid of lines, the point at which a single cell sweeps across multiple lines at once comes much sooner than the point at which that cell achieves its maximum potential for detecting detail.
I didn't just read this in a book. If you test your own two eyes for high-contrast, low-frequency details, you will find that you can see fine adjustments in shape from quite a distance. My usual test is to draw diagonal lines in an old paint program that does not apply any smoothing, leaving the lines nice and jagged ("jaggies" are the stair-stepping effect seen when an angle is drawn on a digital display). As long as the frequency of the jaggies themselves is low enough, I can see these relatively fine details out to as far as nine feet on a 100 DPI monitor (sometimes farther, depending on how long I look and how tired I am, but nine feet is a good number).
Using the exact same monitor, if I look at an image consisting of nothing but alternating black and white single-pixel-wide columns, the screen goes gray at three feet, just as the angle-based calculations predict.
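If you want to check the angle math yourself, it's just the small-angle calculation for one pixel's angular size. Here's a quick Python sketch (the function name and layout are mine):

```python
import math

def arcmin_per_pixel(dpi: float, distance_in: float) -> float:
    """Angular size of one pixel, in arcminutes, at a given viewing distance."""
    pitch_in = 1.0 / dpi  # pixel pitch in inches
    return math.degrees(math.atan2(pitch_in, distance_in)) * 60.0

per_pixel = arcmin_per_pixel(100, 36)   # 100 DPI monitor at three feet
cycle = 2 * per_pixel                   # one black+white column pair
print(f"{per_pixel:.2f} arcmin/pixel, {60.0 / cycle:.1f} cycles/degree")
# ~0.95 arcmin/pixel, ~31.4 cycles/degree. A 20/20 eye tops out around
# 30 cycles/degree (one arcminute per line), so the grating washes out
# to gray right around three feet, as observed.
```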
So the commonly quoted numbers are true, provided you're looking at a perfectly even grid.
A picture of a person's eyes, on the other hand, is a perfect example of a shape that will take full advantage of 4K resolution. Actually, as far as I can tell, it is extremely rare for any given image to contain details that are consistent and high-frequency enough to actually blur out.
As I said, I expect to see blurring on an image of a flat sandy beach or a large flat concrete surface, but any irregularities will still "pop".
Based on my testing, a 4K display as small as 44 inches will still provide extra detail from as far as nine feet away.
Actually, my primary concern in measuring that was to find the distance where the jaggies go away and an image becomes so detail-packed that it should be almost indistinguishable from the real thing.
So for my purposes a 44" 4K display should be viewed from farther than nine feet away; when viewing any closer, or using a display larger than 44", I want 8K.
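The 44 inch figure isn't arbitrary, by the way: assuming a standard 16:9 panel, a 44" 4K display works out to almost exactly the 100 DPI of my test monitor, so the nine-foot result carries straight over. A quick sketch (the helper name is mine):

```python
import math

def ppi(diagonal_in: float, h_pixels: int = 3840, aspect=(16, 9)) -> float:
    """Pixels per inch of a panel, given its diagonal size and aspect ratio."""
    w, h = aspect
    width_in = diagonal_in * w / math.hypot(w, h)  # horizontal panel size
    return h_pixels / width_in

print(f"{ppi(44):.1f} PPI")  # ~100.1 PPI, the same pixel pitch as my test monitor
```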
But I'm a little weird like that; most people don't care if they can see jaggies, and would rather see all the detail, flaws and all, than ensure that their display out-resolves their vision.
I also have slightly better than 20/20 vision, so you need to grow the screen a bit to get useful viewing distances for the average person.
So if you have 20/20 vision and definitely want to see all the detail (not out-resolve your eyes), you probably want a minimum of 50" when viewing from a maximum of nine feet away, or, if my math is correct, a 67" screen at twelve feet.
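That last number is simple proportional scaling: a pixel's angular size is its pitch divided by the viewing distance, so holding that ratio constant means the screen grows in proportion to the distance. A quick check (the helper is mine):

```python
def scaled_size(base_diag_in: float, base_dist_ft: float, new_dist_ft: float) -> float:
    """Screen diagonal needed at a new distance to keep pixels the same angular size."""
    return base_diag_in * new_dist_ft / base_dist_ft

print(f'{scaled_size(50, 9, 12):.0f}"')  # 67" at twelve feet
```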