📢 Kolaambi #3: 🤫 Shhh! The machine is learning.

A bi-weekly newsletter with all things tech, people and communities from TinkerHub Foundation.


Dec 03, 2020

“To understand a language is to understand thoughts.” ― Manali Tiwari

നമസ്കാരം. 🙏

In our last TownHall, we asked our community folks a question:
Who will people trust MORE with their babies in 2030: robots or human baby-sitters?
Here’s the response:

Are you really surprised though? Machines are getting smarter every single day. At TinkerHub, we focus a lot on helping hoomans to keep up the pace and learn faster and better. So that the world does not get taken over by Terminators of course! 😲

A word to the wise for those starting out on the journey, though: do not give up quickly. Be consistent and get good at coding and algorithms before jumping head first into machine learning. Trust us, you’ll thank us later.

What to Learn 👩‍💻

How to get started? Here’s a little roadmap from @GKS to get started on ML. It pretty much starts from the basics! 10/10 recommended.

Here’s another one from FreeCodeCamp🔥 with even more details on the techniques and tools, as well as a few sample data sets that you can play around with!
Web scraping is a technique for fetching data from websites by writing scripts. Finding good datasets remains paramount for training your ML models.
Here’s a fun project to scrape news articles and create word clouds using Python and BeautifulSoup (no, it’s not the dish! 😁)
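
To give a feel for what sits behind such a project, here is a minimal sketch of the counting step that feeds a word cloud. It is not the linked project's code: it swaps in Python's built-in `html.parser` so it runs without installing anything, whereas the project uses BeautifulSoup for the parsing step. The sample page, the stopword list and the helper names are all made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser
import re

# Hypothetical sample page. A real scraper would download this HTML first.
SAMPLE_HTML = """
<html><body>
<h1>Machine learning news</h1>
<p>Machine learning models need data.</p>
<p>Good data makes good models.</p>
<script>var ignored = "data data data";</script>
</body></html>
"""

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "the", "and", "of", "to", "in", "is", "need", "makes"}


class ParagraphText(HTMLParser):
    """Collects only the text found inside <p> tags."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph:
            self.chunks.append(data)


def word_frequencies(html):
    """Return a Counter of lowercase words in the page's paragraphs."""
    parser = ParagraphText()
    parser.feed(html)
    words = re.findall(r"[a-z']+", " ".join(parser.chunks).lower())
    return Counter(word for word in words if word not in STOPWORDS)


frequencies = word_frequencies(SAMPLE_HTML)
print(frequencies.most_common(3))
```

A word cloud library then simply draws the more frequent words in a bigger font.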

Pinned 📌

The other day, Adhil asked us, “To get an ML intern role what kind of projects do they expect? 🤔 ”. We asked a couple of experts in the field what skills they look for.

Here’s what Sleeba Paul, ML Engineer and co-creator of Auria Kathi, the first AI poet-artist, said:

1. Basic computer science and software engineering knowledge.
   - Data structures, algorithms, computer networks, operating systems, git, IDEs, etc.
   - It is better to have a software engineer with core competency in ML on the team than a pure ML engineer. For large teams, these criteria need not be met.
2. One or two well-documented open source GitHub repositories related to ML.
   - Good documentation indicates effective communication skills and in-depth knowledge of the project’s technical details.
3. The ability to explain their ML projects to someone.
   - What were the challenges in the project and what did they do about them?
   - What problems couldn’t they solve?
   - What problems did they solve?
   - Why was a particular algorithm used?
   - What was the validation scheme?
   - What was their role, if it was a collaborative project?
4. Theoretical knowledge.
   - They don’t have to learn every algorithm in ML.
   - They should have in-depth knowledge of the algorithms used in their projects.
   - It is good to have completed some MOOCs related to ML. It shows a passion for continuous learning.
5. Cultural fit.
   - Humility and a good attitude towards working in a team.
   - Jerks, even if they are damn good at their work, are a big liability for a team.

This describes an ideal candidate. Things vary according to circumstances.
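
If the "validation scheme" question in the list above sounds unfamiliar, one common answer is k-fold cross-validation: split the data into k parts, and let each part take a turn as the validation set while the rest is used for training. The sketch below is only an illustration of that idea, not something from Sleeba's reply. The function name is made up, and a real project would usually shuffle the data first or use a library helper.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Every sample lands in exactly one validation fold, so the model is always scored on data it did not train on.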

If any of you did end up bagging an ML Intern role, reach out and tell us how you did it. We’d love to feature you.🤩

What to Watch 📽

🎧 Podcast: Here’s a good one that walks through a sentiment analysis of social media posts reacting to face masks across different demographics. It explains step by step how the analysis was done and how the observations were made.
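
For a rough idea of what sentiment analysis by demographic can look like, here is a toy sketch. It is not the method from the podcast: it assumes a simple word-list approach, and the word lists, age groups and posts are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical toy lexicons. Real projects use much larger ones or trained models.
POSITIVE = {"safe", "protect", "comfortable", "glad"}
NEGATIVE = {"hate", "annoying", "uncomfortable", "useless"}


def sentiment_score(text):
    """Count positive words minus negative words in a post."""
    words = [word.strip(".,!?") for word in text.lower().split()]
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def average_by_group(posts):
    """Average the sentiment score of posts within each group."""
    scores = defaultdict(list)
    for group, text in posts:
        scores[group].append(sentiment_score(text))
    return {group: sum(values) / len(values) for group, values in scores.items()}


# Made-up posts tagged with a made-up age group.
posts = [
    ("18-29", "Masks are annoying and uncomfortable"),
    ("18-29", "Glad masks protect my grandparents"),
    ("60+", "I feel safe when people wear masks"),
]

averages = average_by_group(posts)
print(averages)
```

A score above zero leans positive and a score below zero leans negative. Comparing the averages across groups is the "across demographics" part.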

▶️ Here’s an entire list of AI and ML channels on YouTube. Our favourite is the one from Andrew Ng.

What to Read 📚

🔖 Meet Gopikrishnan Sasikumar (again!), a full-time ML engineer at FullContact. Read about his journey of getting started and solving problems here.

[Image: Machine Learning session for PhD scholars]

🔖 Here’s a list of more websites to read interesting articles on ML.
Don’t get caught up in just reading though! Hands-on all the way!
Pratham Prasoon @PrasoonPratham
Best websites for machine learning:
- Towards DataScience
- Kdnuggets
- DataScience Central
- Hunch .net
- SimplyStatistics
- Fastml
What would you like to add?
12:07 PM ∙ Nov 21, 2020

🔖 Here’s an interesting infographic on all the different fields that make up AI. The practical applications are growing by the day. Choose what interests you, then start by identifying problem statements in a domain and the relevant datasets.
Artificial Intelligence & Machine Learning – CivicSpace.tech

⚖️ AI & Ethics

Is it all about the killer robots?

It’s not just killer robots, or machines outperforming humans and causing job losses, that should worry us. AI has been gaining popularity worldwide for its contributions across almost every domain, including but not limited to healthcare, telecommunications, education, security, energy and law enforcement, but one can’t ignore its darker side and the risks that come with it. A fine line separates uses of this technology that are generally considered ethical from those that are not.
Remember deepfakes?
Imagine hidden, unconscious biases getting deployed at scale in ML models that could wrongly prosecute you or deny you admission to a coveted university because of your race or gender. 🤔

Investigative news site ProPublica has found that a criminal justice algorithm used in Broward County, Florida, mislabeled African-American defendants as “high risk” at nearly twice the rate it mislabeled white defendants. Other research has found that training natural language processing models on news articles can lead them to exhibit gender stereotypes. - Harvard Business Review
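
"Mislabeled at nearly twice the rate" is a comparison of false positive rates: among people who did not go on to reoffend, what share was still flagged as high risk? Here is a small sketch of that arithmetic. The numbers are invented for illustration and are not ProPublica's figures. The group names and helper functions are hypothetical too.

```python
def false_positive_rate(records):
    """Share of people who did NOT reoffend but were still labeled high risk."""
    did_not_reoffend = [r for r in records if not r["reoffended"]]
    if not did_not_reoffend:
        return 0.0
    wrongly_flagged = sum(r["predicted_high_risk"] for r in did_not_reoffend)
    return wrongly_flagged / len(did_not_reoffend)


def make_group(wrongly_flagged, correctly_cleared, reoffenders):
    """Build made-up records for one group (illustrative data only)."""
    records = []
    records += [{"reoffended": False, "predicted_high_risk": True}
                for _ in range(wrongly_flagged)]
    records += [{"reoffended": False, "predicted_high_risk": False}
                for _ in range(correctly_cleared)]
    # Reoffenders are included to show they do not affect the false positive rate.
    records += [{"reoffended": True, "predicted_high_risk": True}
                for _ in range(reoffenders)]
    return records


group_a = make_group(wrongly_flagged=4, correctly_cleared=6, reoffenders=5)
group_b = make_group(wrongly_flagged=2, correctly_cleared=8, reoffenders=5)

rate_a = false_positive_rate(group_a)
rate_b = false_positive_rate(group_b)
print(f"Group A: {rate_a:.0%}, Group B: {rate_b:.0%}, ratio: {rate_a / rate_b:.1f}x")
```

A model can look accurate overall and still make this kind of mistake far more often for one group than another, which is why such rates are checked per group.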

We’ve also seen algorithms incorrectly label fake news as genuine, and countries use facial recognition software for mass surveillance while ignoring individuals’ privacy concerns. What happens if a self-driving car runs over a person, or a medical assistant gets a diagnosis or prescription wrong and the result is a fatality? Who is to be held accountable? Can we prosecute an ML model? Or should we put the engineer or data scientist who developed it on trial?
What is ethical AI?
Check out this cool website from MIT that collects human perspectives on the moral dilemmas an intelligent machine faces when it makes decisions.
It’s imperative to ensure that the decisions taken by the models are transparent, privacy-preserving, non-discriminatory, fair and auditable. AI and ML should never be left as a black box, and we need research on mitigating the risks and using them responsibly.
In fact, institutions and organizations are working on policies and decisions to govern the AI ecosystem. In 2016, 74 sets of AI principles were published by the Berkman Klein Center at Harvard University. Schools like Stanford, with its Human-Centered AI institute, and Oxford, with the Stephen A. Schwarzman Centre for the Humanities, are also doing incredible work, if you want to dig deeper into research and policy in this field.
On the other hand, we also have researchers working to write better algorithmic models and developing datasets to prevent biases.

Here’s one you can contribute to. Equality, Diversity, Inclusion (EDI) is a workshop at EACL that brings together researchers working on inclusivity in language technologies with respect to gender, race, sexual orientation, disability and other minority groups, with the aim of building and using datasets that address EDI concerns.

Adhil has shared that they are running 4 new shared tasks. Check these links to learn how to contribute.
Offensive language identification link: https://lnkd.in/eqg7Pdm
Hope speech detection link: https://lnkd.in/eKse3tW
Meme classification link: https://lnkd.in/eaDM4Vd
Machine translation link: https://lnkd.in/eiBskTz

Thanks, Praveen Sridhar, for your inputs on ethics in AI!
— KJ

Upcoming Events 📆

Women in Data Science is having their first virtual meet in March 2021. Be an ambassador to host an event in your city. Check here for more info.
Amazon is conducting a Machine Learning challenge on HackerEarth for people with 2+ years of experience in ML and Python. It is for the role of Business Research Analyst.
PyConf Hyderabad is the regional gathering for the community that uses and develops the open-source Python programming language. The virtual conference will be held on Dec 5-6, 2020. Register here to participate.