Introduction: Augmented Reality Tutorial: Text Recognition

About: My name is Matthew and I attend the University of Pittsburgh. Currently I am a senior, going for a bachelor's in Information Science with a minor in CS. Current interests include augmented reality, virtual real…

This tutorial shows you how to make a simple augmented reality app. We will use the Vuforia SDK in Unity to build a text recognition application similar to Word Lens or Google Translate. Vuforia's text recognition engine can recognize up to 100,000 predetermined words. The tutorial is geared towards beginners, so we will go through everything step by step.

To get started you will need to download:

Unity 3D: https://store.unity.com/

The Vuforia SDK for Unity: https://developer.vuforia.com/downloads/sdk

That's all you need. Let's get started!

Step 1: Start a New Unity Project.

Start a new Unity project and delete the main camera in the hierarchy.

Drag the Vuforia SDK package into the Assets folder.

Go to the Vuforia developer portal (developer.vuforia.com) and create a developer account.

In the developer portal, create a new app license key.

Copy that key to your clipboard.

Go back to Unity and open the Prefabs folder inside the Vuforia folder.

Drag the ARCamera prefab into the hierarchy.

Click on the ARCamera and paste your app license key into its settings in the Inspector on the right.

Step 2: Set Up the Scene.

Now find the TextRecognition prefab in the same folder and drag it onto the ARCamera in the hierarchy, making it a child.

Click the button that says no word list is available. This takes you back to the Vuforia website, where you need to download their sample package for Unity.

Once you have that package, drag it into your Unity Assets folder.

Now, in the Inspector, change the word list to Vuforia-English-Word.

Drag the Word prefab onto the TextRecognition object in the scene, making it a child.

Reset the shader on that game object.

Delete the text on its child.

Step 3: Track Your First Word

On the Word game object, change the type to Predefined Word.

A Word to Recognize field will appear. Type in whatever word you want to track.

Now go back to the TextRecognition game object and change the filter mode to White List.

Check Use Word Prefabs, and enter 1 for the maximum number of simultaneously tracked words.

Go to additional filter words and type in the word you want to track again.

Now switch from the Project tab to the Console tab and click Play.

Hold up something with the text you want to recognize in front of the camera.

Once your text is detected, you will get a message in the console saying that the trackable object has been found.
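To make the Step 3 settings a little more concrete, here is a small plain C# sketch of what a white list combined with a maximum of 1 tracked word amounts to conceptually. This is not Vuforia code and does not call the SDK. The class name WhiteListSketch, the Filter method, and the choice to ignore letter case are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Illustrative model only: this is NOT part of the Vuforia SDK.
public static class WhiteListSketch
{
    // Returns the detected words that appear on the white list, in
    // detection order, capped at maxSimultaneous results.
    public static List<string> Filter(
        IEnumerable<string> detectedWords,
        IEnumerable<string> whiteList,
        int maxSimultaneous)
    {
        // Assumption: matching ignores case. The real engine may differ.
        var allowed = new HashSet<string>(whiteList, StringComparer.OrdinalIgnoreCase);
        var result = new List<string>();

        foreach (string word in detectedWords)
        {
            // Stop once the cap on simultaneously tracked words is reached.
            if (result.Count >= maxSimultaneous)
            {
                break;
            }

            // Words that are not on the white list are simply ignored.
            if (allowed.Contains(word))
            {
                result.Add(word);
            }
        }

        return result;
    }
}
```

For example, calling WhiteListSketch.Filter(new[] { "open", "hello", "door" }, new[] { "hello" }, 1) keeps only "hello". That mirrors the setup above: the camera may see other words, but only the one you typed in is treated as a trackable.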

Step 4: Create the Bounding Box.

Now we are going to create a bounding box around the text so you can see it being tracked.

Right click on the Word game object and add a 2D sprite.

In the Inspector, click the image field on the sprite and choose the greenish-blue box that Vuforia provides.

Rotate the box 90 degrees on the x axis with the controls on the top right.

Scale the box with those same controls until it is the same size as the white box on the Word prefab.

Now click play again and you should have a blue box that moves along with your desired text as you move it.

Step 5: Let's Add Some Text.

Part of the inspiration for this was Word Lens, the app that turned into the Google Translate app.

So, for the last part, let's add some translation-like functionality.

Right click in the blank part of the hierarchy and create a UI Image.

Scale it as you please to cover a large portion of the bottom of the screen.

Change the background color and transparency to whatever you want.

Right click that image and add some UI Text.

For my example I had it say: "Spanish Translation: "

Make the text font as large as you want, keeping in mind that you must also scale the invisible bounding box that houses the text in order to see it completely.

Step 6: Almost Done.

Now we are going to do basically the same thing again next to what we just did.

Right click a blank area in the hierarchy and create a new UI Image.

Scale that appropriately and put it next to what we already made.

Add some UI Text as a child to the new UI image we just created.

Make the text display the Spanish translation of whatever word you are tracking.

Change the image and text colors as you please.

Finally, rename the parent game object (the UI Image) that houses the Spanish text to "spanishText" by right clicking on it in the hierarchy.

Now all that's left is to make it appear and disappear as the text is tracked.

Step 7: Actually Done.

All we have to do now is modify some C# code.

Click the Word game object and double click the DefaultTrackableEventHandler script. This will open the script in MonoDevelop.

Add "public GameObject spanishText;" just under the class definition and press Ctrl+S to save.

Go back to the scene and drag your spanishText game object into the slot that was created under DefaultTrackableEventHandler on the Word game object.

Now that our game object reference is there we can use it in the script.

In the Start function add: "spanishText.gameObject.SetActive(false);"

In the OnTrackingFound function add: "spanishText.gameObject.SetActive(true);"

Finally, in the OnTrackingLost function add: "spanishText.gameObject.SetActive(false);"

Press Ctrl+S to save and Ctrl+B to build.
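Put together, the Step 7 edits might look roughly like the sketch below. Only the spanishText field and the three SetActive lines come from this tutorial. The rest (the Vuforia namespace, TrackableBehaviour, ITrackableEventHandler, RegisterTrackableEventHandler, and the status names) is recalled from older releases of the Vuforia SDK and is an assumption, so compare it with the script that ships in your copy instead of pasting it over the file.

```csharp
using UnityEngine;

namespace Vuforia
{
    // Sketch of the edited handler. The Vuforia types and calls used here
    // are assumptions based on older SDK releases; your copy may differ.
    public class DefaultTrackableEventHandler : MonoBehaviour, ITrackableEventHandler
    {
        // Added in Step 7: drag the spanishText UI Image here in the Inspector.
        public GameObject spanishText;

        private TrackableBehaviour mTrackableBehaviour;

        void Start()
        {
            mTrackableBehaviour = GetComponent<TrackableBehaviour>();
            if (mTrackableBehaviour)
            {
                mTrackableBehaviour.RegisterTrackableEventHandler(this);
            }

            // Added in Step 7: hide the translation until the word is found.
            spanishText.gameObject.SetActive(false);
        }

        public void OnTrackableStateChanged(
            TrackableBehaviour.Status previousStatus,
            TrackableBehaviour.Status newStatus)
        {
            if (newStatus == TrackableBehaviour.Status.DETECTED ||
                newStatus == TrackableBehaviour.Status.TRACKED ||
                newStatus == TrackableBehaviour.Status.EXTENDED_TRACKED)
            {
                OnTrackingFound();
            }
            else
            {
                OnTrackingLost();
            }
        }

        private void OnTrackingFound()
        {
            // The stock code that enables renderers and logs a message stays here.

            // Added in Step 7: show the translation while the word is tracked.
            spanishText.gameObject.SetActive(true);
        }

        private void OnTrackingLost()
        {
            // The stock code that disables renderers and logs a message stays here.

            // Added in Step 7: hide the translation again when tracking is lost.
            spanishText.gameObject.SetActive(false);
        }
    }
}
```

Because spanishText is already a GameObject, calling spanishText.SetActive(true) directly does the same thing. If the slot in the Inspector is left empty, these lines will throw a null reference error when you press Play, so make sure the spanishText object has been dragged in first.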

That's it! Now when you hit Play, the Spanish translation will appear whenever your word is being tracked!

Thanks for looking!
