My Take on Some ML Interview Questions - P1 #ai #inprogress

Chip Huyen is one of my favourite authors in the MLOps space. She has written some great blog posts and a really useful book. In one of them, she asks the following questions. This blog post is my answer to the ones I felt I could contribute interesting points to. Since there are quite a few, I will probably split them up into parts.

Note: These are my views on these questions. They are not a comprehensive resource by any means. Just me thinking out loud on how I would go about solving the problem. Like any research project, as more time passes, these answers might change. They are here as a way for someone starting out to get a feel for approaching problems posed to them.

Duolingo is a platform for language learning. When a student is learning a new language, Duolingo wants to recommend increasingly difficult stories to read.
1. How would you measure the difficulty level of a story?
2. Given a story, how would you edit it to make it easier or more difficult?
You run an e-commerce website. Sometimes, users want to buy an item that is no longer available. Build a recommendation system to suggest replacement items.
When you enter a search query on Google, you're shown a list of related searches. How would you generate a list of related searches for each query?
Each question on Quora often gets many different answers. How do you create a model that ranks all these answers? How computationally intensive is this model?
How do you build a system to display the top 10 results when a user searches for rental listings in a certain location on Airbnb?
When you type a question on StackOverflow, you're shown a list of similar questions to make sure that your question hasn't been asked before. How do you build such a system?
On social networks like Facebook, users can choose to list their high schools. Can you estimate what percentage of high schools listed on Facebook are real? How do we find out, and deploy at scale, a way of finding invalid schools?
How would you build a trigger word detection algorithm to spot the word "activate" in a 10 second long audio clip?
If you were to build a Netflix clone, how would you build a system that predicts when a user stops watching a TV show, whether they are tired of that show or they're just taking a break?
Facebook would like to develop a way to estimate the month and day of people's birthdays, regardless of whether people give us that information directly. What methods would you propose, and data would you use, to help with that task?
Imagine you were working on the iPhone. Every time users open their phones, you want to suggest one app they are most likely to open first with 90% accuracy. How would you do that?
How would you train a model to predict whether the word "jaguar" in a sentence refers to the animal or the car?
How would you create a model to recognize whether an image is a triangle, a circle, or a square?
Given only the CIFAR-10 dataset, how would you build a model to recognize whether or not an image belongs to one of the 10 CIFAR-10 classes?