Augmented Reality (AR) in Python

AR (Augmented Reality) is a system that combines the real and virtual worlds. Have you ever played "Pokemon Go"? If you are not familiar with the game, the player turns on the camera and walks around to find 3D Pokemon that appear at seemingly random real-world locations, shown on the app's screen. The goal is to collect as many Pokemon as possible. Seeing a virtual cartoon character in the real world through the game is AR. The game uses location tracking and geographical mapping and runs on AR technology, which creates a view of virtual objects in the real world.

There is another technology called "Virtual Reality" (VR). AR and VR are not the same. Virtual reality is a "virtual world made real", whereas augmented reality combines virtual objects with the real world. A common rule of thumb describes VR as about 75 percent virtual and AR as only about 25 percent virtual. The difference is easiest to understand with examples: playing a game with a VR headset immerses the player in a virtual world, while playing an AR game brings virtual objects into real-world locations.

There are two major goals for AR technology:
This tutorial gives a brief idea of AR using the OpenCV library in Python. Using the functionality available in Python, we'll try to implement some interesting AR ideas.

How does a Computer Read an Image?

A computer cannot recognize images on its own. An image is a collection of pixels, or picture elements. When we say "My TV's resolution is 1080p", we mean that the screen is 1920 pixels wide and 1080 pixels high, which is a little over 2 million pixels. A 4K (2160p) screen is 3840*2160, which means the screen has over 8 million pixels. A pixel is like a small dot measuring about 0.26 mm, or 1/96 of an inch, although the actual size varies with the display.

The computer can only recognize numbers. Hence, every pixel is represented as numbers according to a color model. The most commonly used color models are RGB (Red, Green, Blue) and CMYK (Cyan, Magenta, Yellow, Black).

RGB model:
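In the RGB model, each pixel is commonly stored as three intensities (red, green, blue), each in the range 0 to 255. As a rough illustration of the pixel arithmetic and this representation, here is a minimal sketch in plain Python. The names `resolutions`, `pixel_counts`, and `tiny_image` are hypothetical and chosen only for this example; no library is involved.

```python
# Pixel counts for two common resolutions, given as (width, height).
resolutions = {"1080p": (1920, 1080), "2160p (4K)": (3840, 2160)}
pixel_counts = {name: w * h for name, (w, h) in resolutions.items()}

for name, count in pixel_counts.items():
    print(name, "has", count, "pixels")

# A tiny 2x2 image written as rows of (R, G, B) tuples, each value 0-255.
# This is an illustrative structure, not how any particular library stores images.
tiny_image = [
    [(255, 0, 0), (0, 255, 0)],      # red pixel, green pixel
    [(0, 0, 255), (255, 255, 255)],  # blue pixel, white pixel
]

# Read the top-left pixel and unpack its three components.
red, green, blue = tiny_image[0][0]
print("Top-left pixel:", red, green, blue)
```

Note that OpenCV, used later in this tutorial, keeps the same three components but orders them as blue, green, red.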
Processing Images with OpenCV library

Among Python's many libraries, OpenCV is one of the most notable. It is a large open-source library that is widely used for computer vision and machine learning applications. Using this library, we can process images and even videos. OpenCV stores every pixel of a color image as a tuple of (B, G, R) values. There are various functions in OpenCV for processing images. Here are some basic functions:

Output:

The picture we're working on:

Height: 621
Width: 500
Number of channels: 3
No-of blue channels: 12
No-of green channels: 12
No-of red channels: 12

Selecting a region in the image:

Resizing the image to half by maintaining the aspect ratio:

Rotating the image by 60 degrees clockwise: