I would love to see some more advanced algorithms....
Often when objects move, the software doesn't grab the whole object, just a part of it. If that part falls inside a zone you monitor (for example "objects inside"), it is incorrectly counted.
I suspect each frame is simply compared to the previous one for changes?
What if the software could learn what a "normal image" looks like: if there has been no movement for a certain time, assume the current frame is the "normal image", and then compare every later frame against it for changes. That way it could actually draw a yellow line around the exact object instead of a box?
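For what it's worth, what I'm describing is basically background subtraction: keep a reference "normal image" and flag only the pixels that differ from it, rather than diffing consecutive frames. Here is a minimal numpy-only sketch of the idea (the class name, thresholds, and frame sizes are all made up for illustration; a real implementation would more likely use something like OpenCV's `cv2.createBackgroundSubtractorMOG2`, plus `cv2.findContours`/`cv2.drawContours` to trace the yellow outline around the changed region):

```python
import numpy as np

class BackgroundModel:
    """Maintain a learned 'normal image' and flag pixels that differ from it."""

    def __init__(self, threshold=25, learn_rate=0.05):
        self.background = None      # the learned "normal image"
        self.threshold = threshold  # per-pixel difference that counts as change
        self.learn_rate = learn_rate

    def apply(self, frame):
        frame = frame.astype(np.float32)
        if self.background is None:
            # first frame with no known motion becomes the background
            self.background = frame.copy()
            return np.zeros(frame.shape, dtype=bool)
        mask = np.abs(frame - self.background) > self.threshold
        # slowly absorb static pixels into the background so gradual
        # lighting drift doesn't trigger false detections
        still = ~mask
        self.background[still] += self.learn_rate * (
            frame[still] - self.background[still]
        )
        return mask

# demo: a flat 20x20 "scene", then a bright object appears
model = BackgroundModel()
scene = np.full((20, 20), 100, dtype=np.uint8)
model.apply(scene)            # learn the "normal image"
scene2 = scene.copy()
scene2[5:10, 5:10] = 200      # an "object" enters a 5x5 patch
mask = model.apply(scene2)
print(int(mask.sum()))        # only the object's pixels are flagged
```

Because the mask covers exactly the object's pixels (not a whole bounding box), a partial object drifting into a monitored zone would only count for the pixels actually inside it, and the mask's boundary is precisely the outline you'd draw.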