Paul has given you some very nice advice. Having used DSLRs to record live events, I can say that it can be done. However, I think you should seriously consider Paul's suggestion of the XF100 (or the more expensive XF300) series.
My assumption is that you will have camera operators and that you would prefer an editing workflow that is relatively straightforward.
I have the Canon 5D Mark III and have used it for live event recording. It will record 29 minutes continuously (and, as Paul said, across multiple files, but without any gaps between them). After 29 minutes, the recording will stop and then automatically start back up. The pause between those two clips is about 5 seconds in my experience. Magic Lantern installed on a 550D/T2i or 600D/T3i (or 60D or 5D Mark II) can be set to restart recording after those cameras' 11-minute segments. The gaps between clips with ML seem shorter to me than the 5D3's (and, presumably, the 650D/T4i's). If the target is TV, I don't recommend cutting the bitrate via ML to lengthen the recording time. With several cameras, it takes planning to make sure that these gaps do not overlap.
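To make that planning a bit more concrete, here is a rough Python sketch of how you might check a start-time plan on paper before the service. It is only an illustration under my own assumptions: the function names are made up, the 29-minute and 11-minute segments and the 5-second 5D3 gap come from the figures above, and the 2-second Magic Lantern gap is a guess that you should replace with what you measure on your own cameras.

```python
from itertools import combinations

def gap_windows(start_offset_s, segment_s, gap_s, program_s):
    """Return (start, end) times in seconds when a camera is between clips.

    start_offset_s is when the camera starts rolling, relative to the
    start of the program. Time before that is not counted as a gap.
    """
    windows = []
    t = start_offset_s + segment_s  # the first segment ends here
    while t < program_s:
        windows.append((t, min(t + gap_s, program_s)))
        t += gap_s + segment_s  # the next segment ends one gap + one segment later
    return windows

def overlapping_gaps(cameras, program_s):
    """List (camera_a, camera_b, start, end) for every pair of colliding gaps."""
    gaps = {name: gap_windows(*spec, program_s) for name, spec in cameras.items()}
    clashes = []
    for a, b in combinations(gaps, 2):
        for a_start, a_end in gaps[a]:
            for b_start, b_end in gaps[b]:
                if a_start < b_end and b_start < a_end:
                    clashes.append((a, b, max(a_start, b_start), min(a_end, b_end)))
    return clashes

# Each entry: (start offset, segment length, restart gap), all in seconds.
# The 2-second ML gap is an assumed value, not a measured one.
plan = {
    "5D3": (0, 29 * 60, 5),
    "600D": (0, 11 * 60, 2),
    "550D": (60, 11 * 60, 2),  # started one minute after the 600D
}

print(overlapping_gaps(plan, 60 * 60))  # [] for this one-hour plan
```

If two cameras with the same segment length are started at the same moment, their gaps land on top of each other every time, which is why the second Magic Lantern camera above starts a minute late.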
You could mix-and-match an XF100 or XF300 with 3 DSLRs. I've done this using a Sony PMW-EX1, a 5D3, and a 600D/T3i. But this makes post-production quite a challenge. The nice thing is that you have a continuous A-camera, with no drops in sound, off of which you can sync the other cameras. The downside is having to color-correct in post-production to make the cameras match. In the church's case, assuming limited windows in the church, the lighting should be constant from week to week. Thus, some color presets could be set after a lot of testing, and these would presumably not stray too much. The main post work would then be to sync and edit (my recommended workflow).
Set far enough back and framed with enough zoom, you should get a fairly shallow depth of field with a good video camera such as the XF100 or XF300 (or the Sony PMW-200), provided there is some distance between the subject and the video screen, as Paul mentions. For other cameras, such as one focused on the congregation, you don't need (and perhaps don't want) shallow depth of field anyway, since the projection screen is not in the shot. The same might hold if you place the other B cameras at angles that exclude the projection screen (e.g., to the side of the speaker, or one on the choir, if any).
I love DSLRs for the shallow depth of field and the look they get. But I only shoot live events with them (along with my EX1 as the primary camera) because I can't afford additional proper video cameras just for live events - and I don't shoot that many live events. In a way, I pay for this in post-production time. Many wedding videographers use DSLRs. But they might only have a couple of cameras for an event that is relatively short, and the artistic quality or look matters a lot to them. They also have a lot of skill in shooting with them - and probably would not dream of relying on the DSLR's autofocus. Add in the autoexposure and good sound that you get (without spending extra money) with proper video cameras. I wonder how much skill you can expect from 4 camera operators, week in and week out?
Finally, as has been documented and discussed here, DSLRs do not deliver full HD resolution. This might or might not matter to you, depending on whether you are delivering HD for broadcast. But at some point in the next few years, it might matter to someone. The Sony PMW-EX1 that I spent a fortune (to me) on 5 years back is still delivering a nice picture in good light and, as mentioned above, is still a capable workhorse for its intended use.
In the end, I would not rate this situation as one in which a DSLR is the right tool. Again, it can be done, as I've laid out above. But I'm not sure you're really setting things up for the best result. If the budget and other considerations dictate, then so be it. But having 4 identical cameras, all with the same profiles and white balance and continuous recording, would make delivering a program for broadcast each week much more efficient and would probably give better quality.