Is it best to crop other people out of training videos for a Create ML activity classifier?

My activity classifier is used in tennis sessions, where there are necessarily multiple people on the court. There is also a decent chance that players on other courts will be in the shot, depending on the camera angle and lens.

For my training data, would it be best to crop out adjacent courts?
