Bulletin #3, Friday, January 19, 2024
|
Important Dates & Reminders
Friday, February 9, 2024 Last day to drop a FULL-TERM class for Winter via CAESAR. Requests after this date result in a W.
Monday, February 19, 2024 Registration for Spring 2024 begins
Saturday, March 9, 2024 Winter Classes End
Monday, March 11, 2024 Winter Examinations Begin
Saturday, March 16, 2024 Spring Break Begins
|
|
We want to hear from you! Please send any upcoming news and events to news@cs.northwestern.edu to be included in future bulletins and/or featured on our socials/website.
Events must be emailed at least one (1) week in advance.
|
|
In this Issue
Upcoming Seminars:
Monday, January 22
"On Data Ecology, Data Markets, the Value of Data, and Dataflow Governance" (Raul Castro Fernandez)
Wednesday, January 24
"TBA" (Nivedita Arora)
Friday, January 26
"Flexible and Faithful Federated Learning and Unlearning Methods" (Ermin Wei)
Monday, January 29
"Making Machine Learning Predictably Reliable" (Andrew Ilyas)
Wednesday, January 31
"Rethinking Data Use in Large Language Models" (Sewon Min)
CS Events:
Bagel Thursday | Jan 25
Northwestern Events
|
|
Missed a seminar? No worries! View past seminars via the Northwestern CS Website (northwestern login required).
|
|
January
22nd - Raul Castro Fernandez
24th - Nivedita Arora
26th - Ermin Wei
29th - Andrew Ilyas
31st - Sewon Min
February
5th - June Vuong
7th - Wanrong Zhang
12th - Bento Natura
14th - Abhishek Shetty
|
Monday / CS Seminar
January 22nd / 12:00 PM
Hybrid / Mudd 3514
|
On Data Ecology, Data Markets, the Value of Data, and Dataflow Governance
|
Abstract
Data shapes our social, economic, cultural, and technological environments. Data is valuable, so people seek it, inducing data to flow. The resulting dataflows distribute data and thus value. For example, large Internet companies profit from accessing data from their users, and engineers of large language models seek large and diverse data sources to train powerful models. It is possible to judge the impact of data in an environment by analyzing how the dataflows in that environment impact the participating agents. My research hypothesizes that it is also possible to design (better) data environments by controlling what dataflows materialize; not only can we analyze environments but also synthesize them. In this talk, I present the research agenda on “data ecology,” which seeks to build the principles, theory, algorithms, and systems to design beneficial data environments. I will also present examples of data environments my group has designed, including data markets for machine learning, data-sharing, and data integration. I will conclude by discussing the impact of dataflows in data governance and how the ideas are interwoven with the concepts of trust, privacy, and the elusive notion of “data value.” As part of the technical discussion, I will complement the data market designs with the design of a data escrow system that permits controlling dataflows.
Biography
In my research, I ask what the value of data is and explore the potential of data markets to unlock that value. My group collaborates with economists, legal scholars, statisticians, and domain scientists. We build systems to share, discover, prepare, integrate, and process data. I have traditionally worked on distributed query processing systems and continue to do so. I received the SIGMOD 2023 Test-of-Time Award. I am an assistant professor in the Department of Computer Science and on the Committee on Data Science at The University of Chicago. Before UChicago, I did a postdoc at MIT with Sam Madden and Mike Stonebraker. Before that, I completed a PhD at Imperial College London with Peter Pietzuch.
Zoom Link: https://northwestern.zoom.us/j/96564331079?pwd=Ukl2MFNzM2sxc2FSZi9Oa0hHUllYUT09
Panopto: https://northwestern.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=569d51db-14d0-4ae3-80ea-b0f5011ffc7c
Research Interests/Area
Data Science, Data Management, Systems
|
|
Monday / CS Seminar
January 29th / 12:00 PM
In Person / Mudd 3514
|
"Making Machine Learning Predictably Reliable"
|
Abstract
Despite ML models' impressive performance, training and deploying them is currently a somewhat messy endeavor. But does it have to be? In this talk, I give an overview of my work on making ML “predictably reliable”: enabling developers to know when their models will work, when they will fail, and why.
To begin, we use a case study of adversarial inputs to show that human intuition can be a poor predictor of how ML models operate. Motivated by this, we present a line of work that aims to develop a precise understanding of the ML pipeline, combining statistical tools with large-scale experiments to characterize the role of each individual design choice: from how to collect data, to what dataset to train on, to what learning algorithm to use.
Biography
Andrew Ilyas is a PhD student at MIT, advised by Constantinos Daskalakis and Aleksander Madry. His main interest is in reliable machine learning, where he seeks to understand the effects of the individual design choices involved in building ML models. He was previously supported by an Open Philanthropy AI Fellowship.
Research Interests/Area
Machine Learning
|
|
Wednesday / CS Seminar
January 31st / 12:00 PM
In Person / Mudd 3514
|
"Rethinking Data Use in Large Language Models"
|
Abstract
Large language models (LMs) such as ChatGPT have revolutionized natural language processing and artificial intelligence more broadly. In this talk, I will discuss my research on understanding and advancing these models, centered around how they use the very large text corpora they are trained on. First, I will describe our efforts to understand how these models learn to perform new tasks after training, demonstrating that their so-called in-context learning capabilities are almost entirely determined by what they learn from the training data, challenging a widely held belief. Next, I will introduce a new class of LMs that fundamentally rethink how models use their training data. These new models—nonparametric LMs—include not only learned parameters but also massive text corpora, from which they retrieve information for improved accuracy and flexibility. I will describe my work establishing the foundations for such models, showing they are more performant with fewer parameters and can easily stay up-to-date. I will also discuss how they open up new avenues for responsible data use, e.g., by segregating permissive and copyrighted text and using them differently. Finally, I will envision the next generation of LMs we should build, focusing on efficient scaling, better information-seeking, and responsible data use.
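To make the retrieval idea in the abstract concrete, here is a minimal sketch of the nonparametric recipe: score corpus passages against a query and prepend the best match to the model's prompt. Everything here is an invented illustration (a TF-IDF scorer over a three-sentence corpus); real nonparametric LMs use learned dense retrievers over massive corpora.

```python
from collections import Counter
import math

def tfidf_vectors(docs):
    """Bag-of-words TF-IDF vectors for a tiny in-memory corpus."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(t for doc in tokenized for t in set(doc))
    n = len(docs)
    vecs = []
    for doc in tokenized:
        tf = Counter(doc)
        vecs.append({t: tf[t] * math.log((1 + n) / (1 + df[t])) for t in tf})
    return vecs

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query, docs, k=1):
    # Vectorize corpus and query together, then rank by cosine similarity.
    vecs = tfidf_vectors(docs + [query])
    qvec = vecs[-1]
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(qvec, vecs[i]), reverse=True)
    return [docs[i] for i in ranked[:k]]

corpus = [
    "The Allen School is at the University of Washington.",
    "SHAP values explain model predictions.",
    "Retrieval-augmented models fetch passages at inference time.",
]
passages = retrieve("how do retrieval augmented language models work", corpus, k=1)
# The retrieved passage is prepended to the prompt, so the model can
# condition on up-to-date corpus text rather than only on its parameters.
prompt = "Context: " + passages[0] + "\nQuestion: ..."
```

The design point the abstract makes is that the corpus is part of the model: swapping or updating `corpus` changes the model's behavior without retraining any parameters.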
Biography
Sewon Min is a Ph.D. candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research focuses on language models (LMs): studying the science of LMs, and designing new model classes and learning methods that make LMs more performant and flexible. She also studies LMs in information-seeking, legal, and privacy contexts. She is a co-organizer of multiple tutorials and workshops, including most recently at ACL 2023 on Retrieval-based Language Models and Applications and upcoming at ICLR 2024 on Mathematical and Empirical Understanding of Foundation Models. She won a paper award at ACL 2023 and a J.P. Morgan Fellowship.
Research Interests/Area
Natural language processing; Machine learning
|
Start your day off right with free bagels. Connect with fellow CS students & faculty on a cold winter morning.
|
Thursday, January 25, 2024; 9:00 AM – 11:00 AM
|
RAISO | NYT vs OpenAI: LLMs and Copyright Infringement
|
How do AI creators decide what to train their models on? What right do chatbots have to republish work and information written by journalists?
Join RAISO this Monday (1/22) to discuss the ongoing lawsuit between The New York Times and OpenAI, and to enjoy free empanadas!
|
Monday, January 22nd, 4:00 PM
|
The Balance Between Model Performance and Interpretability in Demand Forecasting
|
Center for Deep Learning Seminar:
Jeff Tackes
Sr. Manager of Data Science, Kraft Heinz
Thursday, February 1st, 4:00 PM – 5:00 PM
Ford Design Center, Room 1.350 (ITW Classroom)
2133 Sheridan Road, Evanston, IL 60208
|
Abstract: In the rapidly evolving landscape of demand forecasting, businesses increasingly push the limits of forecast accuracy by looking to new “state of the art” models to predict market trends and consumer behavior. However, the complexity of these models often comes at the cost of interpretability, posing significant challenges for data scientists and decision-makers. This talk aims to address the critical balance between leveraging the predictive power of complex “black box” models and maintaining the interpretability necessary for strategic business applications. Our discussion will begin by exploring the latest advancements in time series forecasting and discussing their strengths and limitations. We will delve into the intricacies of models that, while powerful, often function as black boxes, making it difficult to understand the “why” behind their predictions. This opacity can hinder trust and adoption in business environments where understanding the reasoning behind forecasts is as crucial as the forecasts themselves.
To bridge this gap, we introduce SHAP (SHapley Additive exPlanations) values, a cutting-edge approach in model interpretability. SHAP values, grounded in cooperative game theory, provide a robust framework to decipher the contribution of each feature to the prediction of a complex model.
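As a concrete illustration of the game-theoretic idea behind SHAP, here is a self-contained sketch that computes exact Shapley values for a toy demand model by enumerating feature coalitions. The model, feature names, and numbers are all invented for illustration; in practice one would use the `shap` library, which approximates these values efficiently rather than enumerating every coalition.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for the prediction f(x), marginalizing
    absent features to a fixed baseline (the simplest SHAP background)."""
    n = len(x)
    phi = [0.0] * n

    def eval_subset(subset):
        # Features in `subset` take their actual values; the rest
        # fall back to the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Classic Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (eval_subset(set(S) | {i}) - eval_subset(set(S)))
    return phi

# Toy "demand model": linear, so each Shapley value is exactly that
# feature's contribution relative to the baseline.
def demand(z):
    price, promo, season = z
    return 100 - 2.0 * price + 15.0 * promo + 5.0 * season

x = [10.0, 1.0, 2.0]        # observed week: price, promo flag, seasonality
baseline = [8.0, 0.0, 0.0]  # "average week" background
phi = shapley_values(demand, x, baseline)
# Efficiency property: the attributions sum to f(x) - f(baseline).
```

The efficiency property is what makes the attributions trustworthy in a business setting: the per-feature contributions add up exactly to the gap between the forecast and the baseline forecast, so nothing is left unexplained.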
The talk will include real-world examples of how businesses today decide how to deploy machine learning in their business practices.
Bio: Jeff Tackes is Sr. Manager of Data Science for Kraft Heinz, headquartered in Chicago, IL. With over 10 years of industry experience, Jeff has developed a deep understanding of the intricacies of demand forecasting and has successfully built best-in-class demand forecasting systems for leading Fortune 500 companies.
Jeff is known for his data-driven approach and innovative strategies that optimize forecasting models and improve business outcomes. He has led cross-functional teams in designing and implementing demand forecasting systems that have resulted in significant improvements in forecast accuracy, inventory optimization, and customer satisfaction. Jeff's expertise in statistical modeling, machine learning, and advanced analytics has enabled him to develop cutting-edge forecasting methodologies that have consistently outperformed industry standards. Jeff's strategic vision and ability to align demand forecasting with business goals have made him a trusted advisor to senior executives and a sought-after expert in the field.
|
|
Thursday, February 1st, 4:00 PM – 5:00 PM
|
Ford Design Center, Room 1.350 (ITW Classroom)
2133 Sheridan Road, Evanston, IL 60208
|
Towards Human-centered AI: How to Generate Useful Explanations for Human-AI Decision Making
|
The Technology & Social Behavior Ph.D. Program is excited to welcome Professor Chenhao Tan of the University of Chicago to campus.
Professor Tan will give a talk entitled “Towards Human-centered AI: How to Generate Useful Explanations for Human-AI Decision Making” that will take place Thursday, February 8th from 4:00pm-5:00pm, with a reception to follow, in the Human-Computer Interaction + Design Center (Frances Searle Building, Room 1-122).
We welcome you to join!
|
Thursday, February 8th 2024; 4:00pm-5:00pm
|
Human-Computer Interaction + Design Center
(Frances Searle Building, Room 1-122)
|
Finnigan, Ankeny Named to MS Leadership Positions
|
Shelley Finnigan will serve as associate dean for master’s and professional education, while Casey Ankeny has been appointed assistant dean for TGS master’s programs.
Read More
|
New Book Presents Methodologies for Designing Embedded Systems Products
|
The textbook aims to guide undergraduate and graduate students in computer engineering, electrical engineering, and computer science who are pursuing an industry career in the design and implementation of embedded systems and Internet of Things applications.
Read More
|
Building Community Among Early-Career Researchers in Theoretical Computer Science
|
The Northwestern CS Theory Group and Toyota Technological Institute at Chicago co-hosted the Junior Theorists Workshop held November 30 – December 1.
Read More
|
Laura Brueck honored for collaboration, mentorship and promotion of global intellectual engagement
|
Provost Award for Exemplary Faculty Service recognizes service to Northwestern and academic citizenship
Read More
|
Jennifer Lackey and Marcelo Vinces receive Daniel I. Linzer Awards
|
Recognizing excellence in diversity, inclusivity and equity at Northwestern
Read More
|
© Robert R. McCormick School of Engineering and Applied Science, Northwestern University
|
|
Northwestern Department of Computer Science Mudd Hall, 2233 Tech Drive, Third Floor, Evanston, Illinois, 60208 Unsubscribe