“The Rise of Natural Language Understanding – Professor Chris Potts and the Impact of Models GPT-3 and DaVinci 2/3”
- Chris Potts is a professor and chair of the Department of Linguistics at Stanford, and also teaches a graduate course in Natural Language Understanding
- He hosts an interesting podcast, publishes research papers, runs research projects, and his expertise in NLU makes him a great resource
- NLU has seen incredible advancements since 2012 due to models like GPT-3 and DaVinci 2/3 that are accessible via API or open source
- These models have had increased societal impact, with many systems derived from them for code generation, search technologies, text-to-image generation, etc.
- Benchmarks for performance measurement have saturated faster than ever.
“The Rise of Mega Language Models: Navigating the World of 8.3 Billion Parameters and Beyond”
- Recent advances in large language models have made previous model sizes look trivial
- Progress has been rapid: models from 2018 had around 100 million parameters, which now looks small next to the present-day 8.3-billion-parameter Megatron-LM model and the 540-billion-parameter PaLM model from Google (see the back-of-the-envelope sketch after this list)
- The sheer scale of these models has led to a central question: how can researchers contribute if they do not have the resources to build their own large language models?
- There are many options for contributing, such as retrieval-augmented learning, creating better benchmarks, solving the last-mile problem for practical applications, and achieving faithful, human-interpretable explanations of how these models behave.
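As a rough illustration of the scale gap described above, here is a minimal back-of-the-envelope sketch in Python. The 2-bytes-per-parameter (fp16) figure is an assumption for illustration, not a number from the talk.

```python
# Back-of-the-envelope comparison of the model sizes mentioned above.
# Assumption: 2 bytes per parameter (fp16 weights); fp32 would double these numbers.
BYTES_PER_PARAM = 2

models = {
    "2018-era model": 100e6,   # ~100 million parameters
    "Megatron-LM":    8.3e9,   # ~8.3 billion parameters
    "PaLM":           540e9,   # ~540 billion parameters
}

for name, n_params in models.items():
    gib = n_params * BYTES_PER_PARAM / 2**30
    print(f"{name:>14}: {n_params / 1e9:7.1f}B params ≈ {gib:9.1f} GiB of weights")
```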
Exploring In-Context Learning: The Benefits of the Transformer Architecture and Self-Supervision
- In-context learning traces back to the GPT-3 paper and differs from the standard supervised paradigm in artificial intelligence (a minimal sketch follows this list)
- It makes use of the Transformer architecture and self-supervision, which is a powerful approach for acquiring representations of form and meaning from co-occurrence patterns
- It is still an open question why this works so well, with many researchers attempting to explain its success.
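To make the contrast with standard supervised learning concrete, here is a minimal sketch of in-context learning as prompt construction. The sentiment task, the formatting, and the `complete` call are illustrative assumptions; the key point is that no weights are updated anywhere.

```python
# Minimal sketch of in-context ("few-shot") learning: labelled demonstrations
# are packed into the prompt and a frozen language model completes the pattern.
# No gradient updates happen; the "learning" lives entirely in the prompt.

def build_few_shot_prompt(demonstrations, query):
    """Format (text, label) demonstrations plus a new query as one prompt string."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in demonstrations]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

demos = [
    ("A delightful, moving film.", "positive"),
    ("Two hours I will never get back.", "negative"),
]
prompt = build_few_shot_prompt(demos, "The plot dragged, but the acting was superb.")
print(prompt)
# completion = complete(prompt)  # hypothetical call to whatever LLM API you use
```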
Unpacking the Success of AI Innovations: Static Word Representations and Large Contextual Language Models
- Large-scale pre-training has facilitated the rise of two important innovations
- Static word representations (e.g. Word2Vec and GloVe) and large contextual language models (ELMo, BERT, GPT, and GPT-3)
- Both are powerful tools that enable self-supervision and the release of parameters for others to build on. Human feedback and human effort have also been essential in making these models best in class, notably through instruct models, which rely on binary judgments of good versus bad generations as well as rankings of model outputs
- This has helped demystify how these models achieve so much. Finally, advanced prompting techniques help AI systems reason more logically and precisely by spelling out how to handle questions involving negation (e.g. if we didn’t eat any food, then we didn’t eat any pizza). This is an example of step-by-step reasoning, which helps bridge the gap into knowledge-intensive tasks (see the prompt sketch below).
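The negation example can be made concrete as a prompting sketch. The wording below is an illustrative assumption, not a published template; the point is that spelling out intermediate steps guides the model toward the right answer.

```python
# Illustrative "step-by-step" prompt for the negation example above.
# The direct prompt asks for the answer outright; the second prompt spells out
# the intermediate reasoning the model should follow.

question = "We didn't eat any food. Did we eat any pizza?"

direct_prompt = f"{question}\nAnswer:"

step_by_step_prompt = (
    f"{question}\n"
    "Let's reason step by step:\n"
    "1. Pizza is a kind of food.\n"
    "2. We did not eat any food at all.\n"
    "3. Therefore we did not eat any pizza.\n"
    "Answer:"
)

print(step_by_step_prompt)
# answer = complete(step_by_step_prompt)  # hypothetical LLM call
```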
Exploring the Benefits and Challenges of Language Modeling for Question Answering
- Language models are an effective way to answer questions in the classical setting where the answer comes with a literal-substring guarantee (i.e. it is a span of a supporting passage)
- There is an alternative approach, “LLMs for Everything”, which has potential but also raises issues of efficiency, updateability, and trustworthiness
- These issues can be addressed with retrieval-augmented NLP, which combines dense numerical representations with standard information retrieval techniques and synthesizes the retrieved results into a single answer (see the sketch below).
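Below is a minimal sketch of that retrieve-then-read pattern. The `embed` function is a toy hashed bag-of-words stand-in for a real dense encoder, and the passages, question, and `complete` call are illustrative assumptions.

```python
import numpy as np

# Toy retrieval-augmented QA: embed the question and candidate passages as
# dense vectors, pick the closest passage, then hand it to a language model
# together with the question.

def embed(text, dim=256):
    """Toy dense encoder: hashed bag of words, L2-normalized."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

passages = [
    "Chris Potts is a professor of linguistics at Stanford.",
    "GPT-3 is a large language model released by OpenAI in 2020.",
    "PaLM is a 540-billion-parameter language model from Google.",
]
question = "Who released GPT-3?"

scores = [float(embed(question) @ embed(p)) for p in passages]
best_passage = passages[int(np.argmax(scores))]

prompt = f"Context: {best_passage}\nQuestion: {question}\nAnswer:"
print(prompt)
# answer = complete(prompt)  # hypothetical call to the language model
```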
Exploring a New Framework for Lightweight AI Programming
- The current approach to designing AI systems is to take a set of pre-trained components and connect them together with task-specific parameters
- The traditional approach often fails to create effective, integrated systems
- This has led to an emerging programming mode in which large, pre-trained components exchange prompts as messages, so that entire AI systems become a dialogue between components
- A new paper, Demonstrate-Search-Predict (DSP), provides a framework for this kind of lightweight programming, allowing for maximum use of pre-trained components (see the rough sketch below).
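As a very rough sketch of what such a program looks like, here is the general demonstrate-search-predict shape with toy stand-ins. This is not the DSP paper's actual API; `retrieve` and `generate` are hypothetical placeholders for a retriever and a language model.

```python
# Rough shape of a demonstrate-search-predict style program: the "program"
# is just glue that passes text messages between a retriever and an LM.
# `retrieve` and `generate` are toy stand-ins, not the real DSP API.

def retrieve(query, k=2):
    """Toy retriever; in practice this would query a dense-vector search index."""
    corpus = [
        "Retrieval-augmented systems ground answers in retrieved text.",
        "Frozen language models can be steered entirely through prompts.",
    ]
    return corpus[:k]

def generate(prompt):
    """Toy LM stand-in; in practice a call to a large language model."""
    return f"<completion for prompt of {len(prompt)} characters>"

def answer(question, demonstrations):
    # Demonstrate: show the model worked examples of the task.
    demo_block = "\n".join(f"Q: {q}\nA: {a}" for q, a in demonstrations)
    # Search: ask the retriever for passages relevant to this question.
    context = "\n".join(retrieve(question))
    # Predict: let the language model answer, conditioned on both.
    prompt = f"{demo_block}\n\nContext:\n{context}\n\nQ: {question}\nA:"
    return generate(prompt)

print(answer("What grounds the answers?", [("What is 2 + 2?", "4")]))
```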
Unveiling the Potential Risk of NLP and AI Technologies
- NLP technology will cause disruption in laggard industries
- AI assistance will become more ubiquitous, and AI writing assistance may be used for student papers
- Negative effects of AI and NLP, such as disinformation spread, market disruption and systemic bias, will be amplified.
“AI’s Surprising Progress in the Past Two Years: Superhuman Trustworthiness and Unsolved Questions to Come”
- Chris discussed his predictions for the next four years of AI technology
- He noted that much of what he predicted two years ago has already come true, and he is surprised by the progress made with text-to-image and diffusion models
- He noted that while AI technologies have become more efficient, they also require very large expenditures and can have an environmental impact
- Finally, he discussed how trustworthiness in these technologies may require them to be “superhuman” and how large language models may be able to answer questions that humans are not yet aware of.
“A Discussion on the Possibilities and Implications of Artificial Intelligence with Petra”
- Petra, a professional development student, discussed the implications and possibilities of Artificial Intelligence
- She suggested that individuals could combine their domain expertise with AI and make meaningful progress on a problem, rather than merely having demos
- Petra concluded by thanking the audience for joining her webinar, and requested feedback for future topics.