The rise of new AI and ML tools brings the ability to tackle previously intractable problems in tech and beyond, and new job opportunities for software engineers who understand both. Codesmith has been teaching ML for seven years, and as the field has evolved dramatically over that time, so has our curriculum. Codesmith puts alums ahead of the curve as the tech industry pivots towards these nascent fields, empowering them to apply their existing skills and experience to make tech more equitable and inclusive.
Codesmith’s new AI/ML curriculum includes:
Codesmith expanded its focus on ML in 2021, launching the Data Science & Machine Learning research group.
Alex Zai (Codesmith cofounder and faculty, former ML engineer at Amazon Web Services, and author of Deep Reinforcement Learning in Action) led the research and engineering side from 2021 to 2023 with Jonathan Bechtel, Emily Kearney, and Gerard Torrats-Espinoza (Assistant Professor at Columbia University and Data Science Institute Fellow). They were joined by Jared Lewis, Senior Engineer on the Codesmith team and the Data Science & Machine Learning program.
“I led the creation of Codesmith’s Machine Learning curriculum building it out over 2 years culminating at the end of 2023 when I prepared to move on to an ML role in industry in late March 2024, a few months after the handover - when I brought on the excellent Jared Lewis. I’m excited to see the curriculum reach even more people in the years ahead as I move into industry”
Codesmith residents have also contributed to some of the most important tools in the ML/AI space, including TensorFlow.js, where, under the mentorship of Alex Zai, they created preprocessing layers and interfaces for computer vision that are used by Alibaba's tech team.
Now, that curriculum is being integrated and expanded in the main software engineering & AI program.
Senior Curriculum Manager James Laff explored the best approaches to instilling the mental models needed to truly understand this branch of engineering. James will be co-leading ‘Software Engineering in the Age of AI’ on Frontend Masters with Codesmith CEO and founder Will Sentance; both are experts in engineering pedagogy, having taught between them at the BBC, Google, Harvard, and Unilever, among others.
James was able to build on the work of Codesmith graduates who have been working in ML and AI for many years already.
Brandi Richardson, a Codesmith alum, formerly at Microsoft and now a software engineer at Google, built an AI chatbot used to communicate with patients while she was at Microsoft. She explains that the chatbot “may not be able to communicate all of its emotions like a human being would, so my job was to humanize AI and create more of a connection.”
Codesmith’s curriculum is about teaching talented engineers to ‘learn how to learn’, AI tools included; the new work is about systematizing that in the curriculum. The first question was identifying “how to build foundational mental models of areas that are evolving daily,” says Codesmith CEO Will Sentance.
Codesmith’s pedagogy doesn’t require residents to learn every programming language that they might go on to use, but gives them the capacity to learn any language. This means instilling in residents the mindset and technical capacities to learn Python, Vue, or Angular when needed.
"AI and ML require augmenting the set of mental models Codesmith grads establish before they even enter the program to include neural networks, specifically large language models (LLMs)," Will explains.
Residents must understand how LLMs work, how they represent data, how they understand data, and what they do on a fundamental level.
“When all you have is a hammer, everything looks like a nail. If all you know is React you're very limited. But if you understand fundamentally what a tool is for, how it was designed, why to use it, and what other tools are available, you're ready to be a meaningful contributor on an engineering team at the mid to senior level. That’s what Codesmith’s curriculum instills in residents,” says James.
This is the reason behind the curriculum changes: to give residents the foundation to incorporate AI and ML into applications as the demand for them grows and their ability to solve previously unfeasible problems becomes clearer.
James explains how he managed to transform existing AI and ML research and current implementation into a workable curriculum for Codesmith’s residents.
Balancing close review of the latest research papers with experimentation in production settings is essential to integrating AI approaches effectively.
“Research papers are the core tool of communication in a field that is built on a strong academic foundation. But then I also work with engineers in the field who are incorporating AI and ML into their workflow as developers or an actual application in production,” James says.
The clearest priority was to emphasize LLMs (built upon the transformer architecture), given that recent advancements have opened the door to their use by more engineers.
“Developments over the past 5-10 years — especially in transformer architecture and the self-attention mechanism driving LLMs — are remarkable. Instead of specialized models extensively trained to do one thing very well, we now have more general-purpose models that can do many things really well, fundamentally changing the skill set needed to use these tools,” James explains.
A custom model trained by an in-house team of MLEs or researchers isn’t always needed today, James explains: “Companies can either host an LLM themselves or pay for a cloud-hosted model, and their engineers can guide it to do what the company needs.”
This yields new opportunities for software engineers with under-the-hood knowledge and practical skills to interact with these models. James describes this as a “seismic shift,” as it enables engineers to use these LLMs and to provide companies, many of them outside the tech landscape, with meaningful access to them.
“As software engineers, we are primarily concerned with how to use these tools in production. The different layers that are necessary to achieve that is the gulf that we're bridging.”
As Jonathan Bechtel says, “Codesmith has been pushing the open source community in AI tools forward for a number of years, with students of the Data Science & Machine Learning Research group contributing to TensorFlow.js in 2022-23.
“As new tools emerge we’re seeing these efforts come to fruition as all residents will have the opportunity to contribute to building open source tools for ML/AI workflows from this year.”
In the history of software engineering, as new technologies and practices are embraced, new titles evolve organically to fill companies’ expectations and needs.
James says, “In AI and ML, there’s first a need for engineers — people who translate problems into solutions — who are able to use AI and integrate it into products. That’s crystallizing into a term: AI engineer.”
Codesmith alums are well prepared to step into this role given their exposure to these technologies and their work building tools for other developers.
Software engineers are central to AI and ML, as the real challenges of integrating models into applications are software engineering challenges at their core. This was the case with GitHub Copilot.
“The team behind it has said 90% of it is traditional software engineering. The challenges centered on testing, UI, latency — very much software engineering challenges — even though this is a tool that is heavily dependent on its AI integration.”
James mentions a Codesmith alum who attended the first LLM workshop. “She graduated last year, and was saying that she will be single-handedly responsible for building out the AI/ML component of her company’s application. It wasn't what she was hired for, but she's been entrusted with it.”
Alex Zai says, “Codesmith prepares graduates for roles at the intersection of software engineering with the latest ML and AI workflows. The program covers core topics like mental models behind LLMs and a model pipeline, as well as the practical applications of these tools in projects and open source contributions.”
The fusion engineer, a software engineer with a specialization in another discipline such as law or education, is increasingly in demand in the hiring market as companies look to leverage new technologies using engineers with experience in their domain.
“Domain expertise drastically improves the performance of your prompt for an LLM,” James explains. “For example, an engineer with legal training knows the words, framing, and reasoning process that should be reflected in the model’s response. This means they’re already light years ahead in terms of figuring out how to build a prompt that will produce that output.”
“This is not a nice to have, but a need to have, and we're going to see much more of it. It's incredibly challenging and different to traditional programming, but it's something that Codesmith alumni are especially well suited to given their backgrounds.”
Concerns around lack of diversity and implicit biases in AI and ML models were front of mind when building out the new curriculum.
“There's a real lack of transparency around the pre-training data for these models. We don’t know exactly what biases exist or what has been done to mitigate them,” James says. “Many models were trained on Reddit — most Reddit content comes from white men between 18 and 29 — or Wikipedia, and around 85% of Wikipedia content is written by men.”
This gives Codesmith’s focus on driving diverse engineers to the top of tech added impetus, as it is both a need and an opportunity to mitigate biases while AI and ML are still nascent fields.
Reflecting on her work building out AI tools at Microsoft, Brandi says, “technology, including AI, should be used for the advancement and benefit of human beings. With everything I strive to do, in the back of my mind, I always think, ‘how can I keep humans at the center of this technology?’”
Codesmith engineering writer Diana Cheung explains, “We have to be mindful of bias and inaccuracy in the data we provide to the LLMs. The biases and inaccuracies in our input data may propagate into the LLM outputs. Remember that garbage in, garbage out.”
James Laff adds, “We want people with an awareness of these issues and a commitment to addressing them in AI and ML leadership positions. We have a responsibility to ensure that these powerful technologies are implemented equitably to help those who need them most.”