Against the backdrop of technological change and globalization, and driven by the boom in mobile networks and AI, this study focuses on “card-punching” and “selfie-taking” as typical short-video image practices to explore how city imaging in the new media era reshapes the nexus between human beings and the world, thereby broadening the significance of communication and media. The study posits that short-video-based image practices break with the premises of media representationalism. By weaving embodiment into the texture of cyber cities, these practices mobilize the power of images to construct social reality. As a mode of being, they establish a self that exists in both the actual and the virtual realms: “I photograph, therefore I am” has become the motto. As an embodied form of media practice, they pool the individual tracks of the masses to rebuild a public city image, supporting the claim that “we card-punch, therefore the city is.”