Raymundo Ayala: Hello everyone, and welcome to our senior project presentation. My name is Raymundo Ayala, and I am accompanied by my teammates Nicholas Jordan and Alexis Barcenas. Our senior project is an online learning service for ASL known as Handango. During the presentation, we will first introduce our project, then give an overview of the TensorFlow implementation. After that we will see a demo of the UI and the user account features, along with an explanation of the lessons. And lastly, we will conclude this presentation. So why did the team choose Handango? The number one reason we chose Handango was the limited selection of online ASL learning services. For example, The ASL App is a video-based, categorical, and visual application for learning ASL, but it lacks ways for the user to practice, which we hope Handango will solve. Another reason we chose Handango was to provide a different approach to teaching. Many online language learning services, like Duolingo, utilize entertaining games and quizzes to not only make learning fun, but to keep users coming back for more. Now, that style of teaching may keep the user interested, but it does not provide comprehensive knowledge of the language. Handango will offer comprehensive lectures consisting of interactive lessons and quizzes, and the user will be able to unlock lessons as they pass the quiz for each lesson. We will also offer a way for users to practice signing in front of a webcam. Now, who is our target audience? Our target audience is online learners of all ages. With the current global pandemic, we have seen the online learning industry become extremely popular. People are not required to live near a university or physically attend, and many people want to learn from the comfort of their own home at their own pace. 
And as you can see in the graph, the potential online market keeps growing: in 2019 its value was about 60 billion dollars, and it is expected to nearly triple by 2027 to about 170 billion. Friends, family, and co-workers of those with a hand disability will also benefit from Handango. In 2018, an estimated 11 million Americans were living with a hand disability. Those 11 million Americans all have friends, families, and co-workers with whom they need to communicate, but no thorough and comprehensive ASL learning application exists. Now, you may be wondering why people would choose Handango, or why they would be interested. On the Duolingo forums, a platform of around 300 million people, users have expressed interest in ASL programs, so we can reasonably expect that 10 million or more people could come from that platform. Product features and architecture. Some of the features Handango consists of are a friendly user interface with stylish webpages, so that the user can easily navigate through the dashboard, lessons, and profile settings. Handango also offers two-factor authentication, so that users can secure their accounts through a generated QR code. Lastly, the most important feature of Handango is the machine learning model, a web component for hand gesture validation. Now, I will talk about the architecture of Handango. First we have our Angular front end server, which not only serves the assets and webpages to the user, but also takes care of routing throughout the different pages. Then there are our back end servers, which include Express.js, Node.js, Socket.IO, and MySQL. The interactions between Angular and the Express.js server are responsible for establishing sessions, using middleware to authorize users for access to certain web pages, and providing user information to be used on the front end. 
On its own, the Express.js server also performs most interactions with the database that create and alter a user's profile, sends automatic emails to users, and facilitates two-factor authentication. Meanwhile, the interactions between Angular and the Socket.IO server only take place during the lessons, lectures, and practices, as a way to save the user's lesson progress in their profile while they are working on the lesson. And finally, both the Express.js server and the Socket.IO server interact with the MySQL server to alter the user's profile information in the database. Project management. For this project, our team chose to go with the Scrum software engineering paradigm, with small modifications. We went with weekly meetings instead of the typical daily standup meetings, and each weekly meeting was to discuss goals for features to be implemented. The development of Handango throughout the first semester was mostly individual work: researching, studying, and gathering the material needed. The second semester consisted of team members developing different parts of the project, such as the front end, back end, and machine learning; then putting the lessons together; and lastly, merging everything together. Alexis Barcenas: Hello, my name is Alexis Barcenas. I was in charge of implementing the ASL recognition system for Handango. In this section, I will be discussing our process for creating the model for our web application. First, here is an overview of the main topics I'll be discussing. I will begin by talking about the main differences between a model from scratch and an approach known as transfer learning, and the reason why I chose one over the other. Next, I'll talk about data preparation and data processing. I'll discuss how I organized my directories and what operations we performed on the data to make it feasible for our model. The next item is the model training process. 
I will discuss how we created the model using the Layers API within TensorFlow.js. Lastly, we'll discuss the prediction process and what we did to send data from the client to the server and back to the client with our prediction result. Let's begin with model from scratch versus transfer learning. First, let me better describe what transfer learning is. Transfer learning is the process of training a segment of an already established model and tailoring it to your liking. TensorFlow models are built from layers, which I will discuss in more detail in the model section. We find a point in the model and truncate it, then simply add a few new trainable layers from that cutoff point and begin training them for our needs. The benefits of transfer learning include smaller amounts of data needed, faster training speeds, and more accurate results. Let me share some of my experiences with the benefits of transfer learning. I first tried making a model from scratch, with a small sample of 50 images per training run. Even with 50 images and several classes, it took around 30 minutes to train the model, and it was also not very accurate. I tried adding more images; although this gave me more accurate results, it also took a lot more time, almost two hours to train the model. And while it was more accurate, it wasn't worth the time spent training. With transfer learning, however, I could use that smaller data size to train a model with good accuracy. A run with fifty images per class was enough to produce good results with transfer learning. The training time was also immensely faster: it would only take around 15 seconds to train a model using transfer learning, and it created a more accurate model able to handle more than two classes. So it was a good idea to look into transfer learning for our web application. Next, let's talk about data preparation and data processing. 
First, let me explain the directory setup I had in my training folder. I had a main directory called data that contained many subfolders for the different classes. The image in the right-hand corner shows some of the directories I had, and as you can see, each directory has a unique name. To collect these images, I used a Jupyter Notebook Python script that I set up to capture frames with my webcam every two seconds, and the library I used for this was OpenCV. This script eased the burden of trying to find a good and thorough image set online. Now let's talk about the processing and the operations we needed to perform on our data. To read our images, I simply used the Node.js file system package and a function to read each desired directory. I had a forEach loop to perform the following operations on each of the images. Once we have the image buffer, we can use a simple TensorFlow function to convert it to a tensor. Next, each image needs to be resized, because our pre-trained model only accepts the input shape [1, 224, 224, 3]. We used the resizeBilinear TensorFlow function to resize the tensor. After this operation, we get a tensor of shape [224, 224, 3], very close to our desired input shape; the 3 represents the RGB color channels of the image. Then, we have to expand the dimensions to get our desired input shape. A function called expandDims allows us to add an extra dimension to the tensor. After this, we have our desired input shape of [1, 224, 224, 3]. Next, we simply convert the image values to floats and then normalize the image data. We divide each tensor by 127 and subtract 1 to normalize the values to between negative one and one. The reason we divide by 127 is that the image's RGB values range from 0 to 255. Finally, we just need to push our data into new tensors. We have an xTrain and a yTrain that will hold our tensor data. 
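The normalization arithmetic above can be illustrated in plain JavaScript. This is a minimal sketch of the math only, not the actual element-wise TensorFlow.js tensor operations:

```javascript
// Map an RGB value in [0, 255] to roughly [-1, 1], as described:
// divide by 127, then subtract 1.
function normalizePixel(value) {
  return value / 127 - 1;
}

// Apply to a flat array of pixel values, the way the tensor
// division and subtraction apply element-wise.
function normalizeImage(pixels) {
  return pixels.map(normalizePixel);
}

console.log(normalizeImage([0, 127, 254])); // [-1, 0, 1]
```

Note that dividing by 127 maps 255 slightly above 1 (about 1.008); dividing by 127.5 would give exactly [-1, 1], but either scaling centers the data around zero, which is what the model cares about.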
Our xTrain holds all of our image tensors, and our yTrain holds our class encodings. Our encoding system is one-hot encoding. We push a one-hot tensor that encodes the image's class to the yTrain tensor. A one-hot tensor labels our data set according to the index of the class. For example, imagine a binary classification system with the ASL signs for A and B: the index of A is zero and the index of B is one. The one-hot encoding for the B label would be [0, 1]. This represents an image of B because its index is one, so the value one is at index one, as you can see on the right-hand side of the screen. In this section, I will discuss the model training process. The pre-trained model we chose is called MobileNet. MobileNet was created to be a lightweight and portable model; according to its creators, they wanted applications to run smoothly on less powerful systems like mobile phones and embedded systems. This is good for our project because it enables more people to access Handango, since it lowers the requirements on the computation power of the device. Furthermore, MobileNet's architecture includes 28 layers. It's a good model for transfer learning because it was trained on 1,000 classes and over a million images of a myriad of nouns in the English language. This diverse mix of objects makes it suitable for transfer learning. Now let's talk more about the MobileNet architecture. Layers are the building blocks of a model in TensorFlow.js. They're connected together and built on top of one another until we reach a softmax activation as our output layer for classification purposes. The first layer of MobileNet is the only full convolution layer in its architecture; since MobileNet is meant to be fast and portable, this type of layer is only used once. MobileNet is mainly built with a special type of convolution method known as depthwise separable convolution. 
The benefit of this approach is that it uses eight to nine times less computation than the standard convolution method, and only sacrifices a bit of accuracy for this efficiency. Moreover, this convolution method consists of a depthwise convolution followed by a pointwise convolution. The reason is that even though depthwise convolution is efficient, it cannot combine the input channels of the layers. That is the purpose of the pointwise convolution: it combines the features, and the two work together to perform the operation. After each layer, a batch norm and a rectified linear activation (ReLU) are applied. ReLU is used because it makes the model easier to train and can even offer better performance, and batch norm is a method to make neural networks faster and more stable through normalization of the layers' inputs. Now, let's get into the model creation process. The first step is to load the base MobileNet model from TensorFlow's Google Storage API. After loading the model, we use a function to grab a layer from MobileNet and call this our cutoff point. In our case, this is conv_pw_13_relu. It's usually a good decision to choose something in close proximity to the output layer. We will not be using all 28 layers of MobileNet, because we want to tailor it to our task, not the original task. Layers before the cutoff point are known as frozen layers, because they remain fixed and are not trained on. However, they do contain knowledge of the features of the data sets they were trained on, and this is helpful when training our own model. Finally, we define a variable that is set as the truncated model: we set MobileNet as the input layer and our desired cutoff point as the output layer, and then we have completed our truncated model. Then we create our final model. At minimum we only need to add three layers, but you could possibly add more if needed. 
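The eight-to-nine-times computation savings of depthwise separable convolution mentioned above follows from the cost formulas for the two convolution types. Here is a quick plain-JavaScript check; the kernel size and channel counts are illustrative values, not MobileNet's exact configuration:

```javascript
// Multiply-accumulate cost of a standard convolution layer:
// kernel k x k, M input channels, N output channels, f x f feature map.
function standardConvCost(k, M, N, f) {
  return k * k * M * N * f * f;
}

// Cost of depthwise separable convolution: a k x k depthwise pass
// plus a 1 x 1 pointwise pass that combines the channels.
function depthwiseSeparableCost(k, M, N, f) {
  return k * k * M * f * f + M * N * f * f;
}

// Ratio for a 3x3 kernel with 256 input/output channels on a 14x14 map.
const ratio =
  standardConvCost(3, 256, 256, 14) / depthwiseSeparableCost(3, 256, 256, 14);
console.log(ratio.toFixed(2)); // about 8.7, i.e. eight to nine times fewer operations
```

For a 3x3 kernel the savings ratio approaches 9 as the number of output channels grows, which is where the eight-to-nine-times figure comes from.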
First, we need a flatten layer whose input shape is the same as the output shape of the truncated model. You can then add any number of dense layers with a ReLU activation, depending on your needs. Finally, you add your final dense layer; this layer must have the number of classes you desire in your model as its units parameter, and it must have a softmax activation for the purpose of classification. The softmax activation is often used in the final layer of a model because it normalizes the output of the model into a probability distribution over the predicted output classes. In our case, this is our ASL letter classification. We then need to compile the model using categorical cross-entropy, the Adam optimizer, and the accuracy metric. Categorical cross-entropy computes the loss between labels and predictions. The Adam optimizer is a stochastic gradient descent method based on adaptive estimation of first-order and second-order moments. And accuracy is simply how often predictions equal labels. The final step is to fit the model. We pass in our image tensor, xTrain, our label tensor of one-hot encodings, yTrain, and then our hyperparameters. We pass in the batch size, which is the number of training examples used in one iteration, and then epochs, which is the number of times the entire dataset goes forward and backward through the neural network. After this is done, we finalize our directories of classes, run the training code, and wait for the training to complete. Now let's talk about the prediction process. Here we have a model loaded. To start the camera, I simply click the green button; to stop the camera, I click the red button. Let me demonstrate this now: start the camera, press again to stop it. To send predictions, start your camera, then click the blue button to send data to the server. To send data, simply click the snap button. 
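The softmax normalization described above can be illustrated in plain JavaScript. This is a sketch of the math, not the TensorFlow.js layer itself:

```javascript
// Softmax: exponentiate each score, then divide by the sum, so the
// outputs are all positive and sum to 1 (a probability distribution).
function softmax(scores) {
  const max = Math.max(...scores); // subtract the max for numerical stability
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Three raw class scores, e.g. for the letters A, B, and C.
// The largest score gets the largest probability.
const probs = softmax([2.0, 1.0, 0.1]);
console.log(probs.map((p) => p.toFixed(3)));
```

This is how a raw score vector from the final dense layer becomes the certainty percentages shown in the demo.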
Here is your sign, and then we take a prediction. As you can see, I just did the A sign. The info is displayed below: you get your prediction result, and then a probability for the certainty of that sign. Let's try a different sign this time; let's do B. We did B, and we got an 85.63% chance that the sign is B. Let's try one more to demonstrate the process. I just did the F sign, and got an 89.56% chance. So this is the machine learning process of Handango. Let's stop the camera and head back to the dashboard. This concludes the machine learning aspect of Handango. Thank you for listening. Nicholas Jordan: Hello, my name is Nicholas Jordan, and I'm going to guide you through the UI that Raymundo implemented using Angular and Bootstrap, as well as the account features that I implemented using a Node.js and MySQL back end. Prior to logging in or creating an account, the user is greeted by the home page, where they are able to learn a little more about Handango, such as how lessons are assigned and the machine learning component, and on the About page, a little more about how the lessons are tracked and progress is stored in the user account. Most of that is covered by Alexis; I'm here to cover the account features, starting with the three different login types. We have email and password, Facebook, and Google. Choosing to log in using Facebook or Google will redirect you to their respective webpage to handle the external sign-in, and once you give them your login information, our back end receives enough information to create an account for you. The implementation for this was relatively simple using the passport node package, which can handle external logins from many different websites, from Facebook and Google to GitHub and LinkedIn. 
Creating an account this way is simple for the user and for the back end, since none of the user information needs to be verified; it has already been verified by Facebook or Google. Registering an email and password account is a little more involved. You have to enter more things on your own: your username, first and last name, email, and password. But before you can log in, we need to verify that the email actually exists, by clicking on the link sent by the automatic email service, which was implemented using the Nodemailer node package. I'll test that out here. You see, my account is not verified, so I need to go to my email and verify using this link. Now we're fine. The link contains a long URL-safe string that is stored in the user's account along with a timestamp in the database. This string is the key to verifying the email: if the string doesn't match what is stored in the user's account, or if the link expires after an hour, the account is not verified and is automatically deleted. And logging in, now that I've verified, works, and I'm at the dashboard. Let's say then that you forgot some of your account's login information and can't log in; you don't need to contact support just yet. If you forgot your email, you just need to give us your account's username, and the email address tied to that username will be sent an automatic email reminding you which email to use to log into your account. If you forgot your password, then you just need to give us your account's email, and an automatic email will be sent to the account with a link to change the password. This link contains another long URL-safe string, and upon clicking it, that string gets hidden from the user, and the user just needs to put in a new password. Let's put a new password in. The password gets changed; let's try logging in with it. And it works. However, just the act of changing your password is pretty suspicious and is a security concern. 
So another automatic email is sent to the account, with a link, expiring in an hour, to disable the account; this also contains a long URL-safe string. Now the account is disabled. Let's try to log in, and you can see that the account is disabled and no one can log in. So you should probably contact support, and they'll get back to you when they feel like it. Now that my account is disabled, let's go ahead and create a new account using Google, so I don't have to verify anything, and we'll be greeted by the dashboard, which is where the lessons are made available; Alexis will go into more detail about this. We're heading to the Settings page, where we'll first view some of the account information: creation date, last login, login type, first name, last name, username, and email. Going on, we see Security. Normally, for an email and password account, there would be a form to change your password, but we're a Google account, so we can only enable two-factor authentication, which we'll go ahead and do. Something you'll notice is that this two-factor authentication implementation depends on Google Authenticator, as well as a node package called Speakeasy to generate the QR code and secret, which appear down here for the user to scan and put into their authenticator app, which I'll do now. After scanning the QR code with my authenticator app, I can get a six-digit code and input it into this form. Now I have two-factor authentication enabled, and I'll be automatically logged out. When I try to log in, I'll be asked to put in a code; it's changed now, so I'll input it, and now I can log in. This method is particularly insecure compared to an SMS-based solution, but unfortunately, automatically sending text messages requires the use of costly APIs, which we do not have the money for, and a homemade solution, which I've tried, is convoluted and not well supported. 
This works just fine as a proof of concept, though, and was simple enough to implement. So now let's change my Google account to a Facebook account login. Down here, we'll click this switch to change to a Facebook account, and I'll be asked for two-factor authentication again. Now we're set; we've been switched to a Facebook account. If I try to log in using my Google account, it won't work; it tells us that's an invalid method for my account. So I log in through Facebook, with two-factor authentication. Going back to the settings, we'll see that I switched to a Facebook account, and down here, we'll see that I can't switch to a Facebook account, since I already am one, but I can switch back to a Google account, and I can switch to an email account. I'll go ahead and switch to an email account to show that. We'll enter the email and give it a password, since our account right now does not have a password stored, and change to an email account. Similar to registering an email and password account, switching to an email and password account again requires you to verify your email, so just click the link, and we're verified. We can go ahead and log in, with two-factor. Now that I'm verified, you can see that I have an email and password account, and that my username is different. Going to Security, you can now see the form that I mentioned to change the password, and I'll go ahead and do that. Here's my current password, and here is a new password, one character longer. We'll change the password, with two-factor authentication (it changed on me), and it goes through: password change successful, and I am being logged out. We'll again get the email to disable the account, but we won't do that this time, since you know how that works. 
New password, lots of two-factor, and now we are in. So that's done. The last thing to change in an account is its email. You do that by entering the current email for the account, the new email, repeating the new email, and two-factor. That's wrong; there we go. The email is changed. In this case, the old email account gets a link saying the email has changed, which I can click to disable the account; the new one gets a link to verify. So I verify, log in with the new email, and we can see that it works. Give it a two-factor code, and we're in with our new account. In the settings, you can see that the email has changed completely. So that about wraps up all of the account features, where we got to see the functionality between the back end code that I wrote and some of the front end code that Raymundo wrote. I am currently at my dashboard. Alexis Barcenas: Before we head into the demo, let me give an explanation of how this module works. We currently have three full lessons implemented, each with a lecture, a practice, and a quiz. All lectures are adapted from Irene Duke's The Everything Sign Language Book, and are just static web pages, whereas practices and quizzes are dynamic, in that they are built on the fly using information from the database. Practices and quizzes are both composed of mini games, but they differ greatly in how they are handled in the back end. Before we get into that, let me first explain how the mini games are stored. Each mini game has its own table that stores an ID and a MySQL JSON data type. Using JSON makes it easy to organize multiple phrases or questions into one mini game instance, and it also makes it easy for the front end to parse. As a proof of concept, each mini game is loaded with only 25 different sets of three phrases or questions. 
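As an illustration of why the JSON column is convenient, here is a hypothetical mini game row and the parsing the front end might do. The field names and content are invented for this sketch and are not the actual schema:

```javascript
// A row from a mini game table as described: an ID plus a JSON
// document that bundles one instance's three phrases/questions.
const row = {
  id: 7,
  data: JSON.stringify({
    type: 'translation',
    items: [
      { prompt: 'sign-for-A.gif', answer: 'A' },
      { prompt: 'sign-for-B.gif', answer: 'B' },
      { prompt: 'sign-for-C.gif', answer: 'C' },
    ],
  }),
};

// The front end parses the JSON once and gets the whole instance,
// rather than joining across several question tables.
const game = JSON.parse(row.data);
console.log(game.items.length); // 3
```

One row per instance also makes the "25 sets of three" proof-of-concept loading trivial: it is just 25 JSON documents per mini game table.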
For practices, when a new lesson is unlocked for a user, several many-to-many relationships are created between the user practice table and each mini game's table, three entries per game. Having this relationship allows the user's answers to the mini games to be saved, so that the user can leave and come back with their answers remaining the same. Quizzes, on the other hand, are much less involved. When the user goes to a quiz page, each mini game assigns three random sets of phrases or questions to the user, so each time the page is visited or refreshed, a new quiz is generated. Quizzes are also timed to be 10 minutes long before automatically being turned in and graded. If the grade is greater than or equal to 70%, then the user passes and unlocks the next lesson. Now let's head into lesson one, where we will learn about the alphabet and fingerspelling in ASL. Once you have navigated to the page, you are introduced to fingerspelling. To move through the lesson, you click on the left-hand side or the right-hand side; to go sequentially, click on the right-hand side. You will then be introduced to use cases for fingerspelling, tips on how to become an accurate fingerspeller, certain conditions people have and how to work around them, warm-ups, and instructions for each letter. When you click on a letter, you'll be given the sign, along with a text description of how to perform it. Here is letter B, here is letter C, and here letter D. You'll then be given other use cases for fingerspelling, such as abbreviations. Once you see the fingerspelling page again, you can continue on to the practice section or return to the dashboard, but let's go to the practice section. Here's one of the mini games we have: you are given the alphabet and have to use it to spell out the word; you just drag and drop, and it fits into the square. Here's another mini game, a translation game: you are given three words, and you have to translate each word and type it into the box. 
Here we have multiple choice questions; as you read each question, answer it. Here is the machine learning aspect of the application. Here, you are simply given three signs, and you match them accordingly: drag and drop, drag and drop, drag and drop. These are all the mini games in lesson one. To see how well you did on the practice section, you can click on the take quiz button. You'll see that with my random answers, it seems I got six correct. Then you can continue on to the quiz or return to the lecture section, but let's head back to the dashboard. What's also neat is that if you go back to the practice, your progress is saved: I can see the two randomly dropped items are still there. Let's head back to the dashboard. Now let's head to lesson two, where we'll learn about numbers and counting in ASL, a little faster this time. In this section, we introduce numbers and counting: use cases for numbers, counting to 10, how to sign each number with some examples (same as the letter section), shapes as descriptors, more numbers, counting by 10s to 90, big numbers where you use Roman numerals, fractions, and conversations you can have with letters and numbers. That concludes this lecture; let's head on to the practice. Here we have a matching game. What's different about the number section is that we have GIFs for the numbers; here I am dragging them so you can see. In another matching game, out of these options, you try to find the one that fits the answer. And here we have a translation game: play the GIF, translate the number, and type it in here. There are more multiple choice questions, and that concludes this practice section. Remember, you can check how many answers you got correct by checking here. Let's head back to the dashboard. Like the last one, your progress is saved. Now let's head into lesson three, where we will learn about types of questions in ASL: how to use questions in ASL, the usual two types of questions in ASL, and wh- questions, like who, what, and when. 
Yes/no questions, interview questions, phrases you can use in an interview, and rhetorical questions. Then you can navigate to the practice. In lesson three, we have videos to identify: you need to identify the phrase and then translate the word into the text box. We have multiple choice questions, and then you're given a word here and you select the correct video. Here, you need to identify the word for the video, then drag and drop to answer the question. This is everything in the practice section of lesson three. Let's head back to the dashboard; like before, everything is saved. I'm now in a new account, and I'm going to show you how to unlock a lesson. To unlock a lesson, you need to get at least 70% on the quiz. Let's go to the quiz. As you see here, you have a 10-minute timer to take the quiz. I'm going to take the quiz, and when I finish it, I'll show you what happens afterwards. As you see, after you turn in the quiz, the second lesson becomes available to the user. Raymundo Ayala: To conclude this presentation, I will be talking about what each team member learned from completing the project and any future plans for it. We have already introduced and demoed the project for you, so let's jump right into it. First we have Nicholas Jordan. He was able to become more familiar and comfortable with Node.js while developing the back end servers, and was able to use it to implement secure user accounts that can be updated and altered smoothly. He was then able to get a better understanding of client-to-server communication through HTTP and WebSocket connections, in order to provide Handango with user authentication and live user account updates. Finally, during the front end and back end merging, he was able to familiarize himself with how Angular works and how to appropriately refactor the back end responses to work best with the front end. Next we have Alexis Barcenas. He was able to learn a lot about the model creation process using TensorFlow.js on Node.js. 
He then became familiar with data preparation and with processing images to feed into the TensorFlow model. Lastly, he was able to learn the major differences between creating a model entirely from scratch and using a pretrained model as the base. With this knowledge, he was able to implement an ASL classification model that can make predictions about people's hand gestures. Lastly, we have Raymundo Ayala, myself. I was able to familiarize myself more with Angular and Bootstrap and the various utilities and components they offer, to not only improve my web development skills, but to be able to deliver user-friendly applications and websites. Another main skill that I found very useful and learned during the project was being able to collaborate with a team and overcome challenges by sharing ideas and source code through a distributed version control platform such as GitHub. Future project plans: as we see it, there is major room for improvement for Handango. The clearest improvement would be offering more comprehensive lectures, which would consist of more lessons, practices, and quizzes. Another huge improvement would be to expand the machine learning model so it can recognize and validate more ASL phrases. As of right now, we do not have a plan to continue working on the project, but we all feel proud of the work we completed by applying our knowledge from several computer science courses. And these are some of the sources that helped us develop Handango, from the market research to the ASL info and assets. Thank you all very much. Transcribed by https://otter.ai Edited by Nicholas Jordan