Exploring New Directions on Robustness in Machine Learning
Machine learning systems are widely deployed to facilitate decision-making. They’re used for many tasks — ranging from image and speech recognition, to medical diagnostics and electronic health record data mining, to securities trading and financial fraud detection — making it vital for the systems to be reliable and secure against adversarial corruptions.
Aravindan Vijayaraghavan and Jason Hartline, professor of computer science in the McCormick School of Engineering, are co-directors of the Institute for Data, Econometrics, Algorithms, and Learning (IDEAL). Launched in 2019 by a team of interdisciplinary investigators at Northwestern, Toyota Technological Institute at Chicago, and the University of Chicago, IDEAL studies the theoretical foundations related to high dimensional data analysis, data science in strategic environments, and machine learning and optimization.
On November 16, IDEAL hosted a workshop focused on new directions on robustness in machine learning as part of the fall 2021 special quarter organized by Vijayaraghavan; Chao Gao, assistant professor of statistics at the University of Chicago; and Yu Cheng, assistant professor of mathematics at the University of Illinois at Chicago.
“During the fall special quarter, we studied some of the foundational questions about when and how we can design methods for machine learning and high-dimensional estimation that are robust and reliable,” said Vijayaraghavan.
Workshop speakers explored novel notions of robustness and the different challenges that arise in designing reliable and secure machine learning algorithms. Discussion topics included test-time robustness, adversarial perturbations, and distribution shifts.
Kamalika Chaudhuri, associate professor of computer science and engineering at the University of California San Diego, discussed the robustness of training algorithms to small, imperceptible perturbations to legitimate test inputs, or adversarial examples, that cause machine learning classifiers to misclassify.
Pranjal Awasthi, research scientist at Google, studies theoretical machine learning with a particular focus on designing robust algorithms for unsupervised learning.
Sébastien Bubeck, senior principal research manager for machine learning foundations at Microsoft Research, discussed joint research he conducted with Mark Sellke, a fourth-year graduate student in mathematics at Stanford University, that illustrates why robustness necessitates large neural networks.
Aleksander Mądry, Cadence Design Systems Professor of Computing at MIT, presented a direct training data-to-output model that is a versatile framework for analyzing machine learning predictions.
Gautam Kamath, assistant professor of computer science at the University of Waterloo, specializes in robust statistics and data privacy. He surveyed different problems and results on differential privacy arising in the context of various statistical estimation settings.
Jinshuo Dong, IDEAL postdoctoral fellow, also helped organize the November event.
IDEAL’s next special quarter, “High Dimensional Data Analysis,” starts in spring 2022 and will include graduate courses, workshops, and reading groups. The spring quarter is being organized by Konstantin Makarychev, professor of computer science at Northwestern Engineering, and Yury Makarychev, professor of computer science at the Toyota Technological Institute at Chicago.
IDEAL is led by co-principal investigators from the three participating institutions. The Northwestern team also includes:
- Randall Berry, John A. Dever Chair of Electrical and Computer Engineering
- Dongning Guo, professor of electrical and computer engineering and (by courtesy) computer science
- Samir Khuller, Peter and Adrienne Barris Chair of Computer Science
- Zhaoran Wang, assistant professor of industrial engineering and management sciences and (by courtesy) computer science
- Eric Auerbach, assistant professor of economics at the Kellogg School of Management
- Ivan Canay, HSBC Research Professor of Economics at Kellogg
- Joel Horowitz, Charles E. and Emma H. Morrison Professor of Economics at Kellogg
IDEAL is a Harnessing the Data Revolution (HDR) Transdisciplinary Research in Principles of Data Science (TRIPODS) institute supported by the National Science Foundation under award CCF 1934931.
article link: https://www.mccormick.northwestern.edu/computer-science/news-events/news/articles/2022/exploring-new-directions-on-robustness-in-machine-learning.html
Jason Hartline Wins ACM SIGecom Test of Time Award
Professor Jason Hartline received the Association for Computing Machinery (ACM) SIGecom 2021 Test of Time Award, the association’s annual honor recognizing authors of influential papers at the intersection of economics and computation.
The number of online marketplaces such as eBay, AirBnB, and Uber has rapidly increased since 2000. These marketplaces combine user behavior and preferences with market mechanisms mediated by technology platforms. The framework from Hartline’s paper and subsequent literature informs how these online marketplaces should be designed to perform well under a wide range of market conditions.
Hartline developed and wrote the paper in the summer of 1999 when he was a second-year PhD student at the University of Washington (UW) and advised by Anna Karlin, Bill and Melinda Gates Chair in Computer Science and Engineering at UW. During a summer internship at InterTrust Technologies’ STAR Lab, a small research lab at a digital rights management startup, he began working with Goldberg and Wright on digital goods and auctions questions.
In the paper, Hartline and co-authors identified a family of research questions at the interface between computer science and economics, presented a framework for analyzing mechanism designs that perform well, and provided methods that were useful for mechanism design questions.
This framework for designing and analyzing mechanisms that work well regardless of participant preferences created a subfield called “prior-free mechanism design,” which was very active between 2000 and 2010. Since then, the subfield has evolved into “prior-independent mechanism design” and a related field called “sample complexity of mechanism design.” The ACM Conference on Economics and Computation, the main conference for interdisciplinary research at the interface between computer science and economics, publishes papers on these topics annually.
“I think it was super lucky that I found — or more accurately, was found by — the topic of auctions before essentially anyone else in CS was thinking about them,” said Hartline. “I think there is huge value in taking on an odd project in an odd area. You might find what you are meant to do, and it might make a career for you.”
article link: https://www.mccormick.northwestern.edu/computer-science/news-events/news/articles/2021/jason-hartline-wins-acm-sigecom-test-of-time-award.html
Controlling Epidemic Spread: Reducing Economic Losses with Targeted Closures
IDEAL’s own Ozan Candogan (University of Chicago) was recently featured in a New York Times article for his research on methods of controlling the COVID-19 pandemic. He and his co-authors, John Birge (University of Chicago) and Yiding Feng (Northwestern University), used cell phone data and COVID-19 infection rates to propose a closure plan that minimizes both disease spread and economic impact. This work is an impressive example of both inter-institution collaboration and application of research, both of which are tenets of our institute.
In response to the COVID-19 pandemic, many cities have instituted uniform (city-wide) suspension of economic activity to varying degrees. However, the spread of the disease relies on human-to-human contact and has an inherent spatial nature, in which infected individuals potentially infect others in locations/neighborhoods they have visited. Birge, Candogan, and Feng propose a spatial epidemic spread model, which explicitly accounts for the spillovers of infections across different neighborhoods in a city. In their model, the individuals who reside in a neighborhood may spend some of their time in another neighborhood. Susceptible individuals from a neighborhood “mix” with other individuals in any of the neighborhoods in which they spend time, and they can get infected there.
The authors study the decision problem of a social planner who can restrict the economic activity in different neighborhoods. The reduction in the permitted level of economic activity in a neighborhood (i) triggers an economic loss, and (ii) decreases the number of individuals who visit that neighborhood. The latter effect reduces the infections among individuals who reside in that neighborhood as well as those who reside elsewhere but spend time there. They provide a framework for controlling the spread of the epidemic in two regimes, accounting for scenarios where the number of infections is large and small. In the first regime, their approach yields targeted closure policies that reduce infections in all neighborhoods while inducing a minimal economic loss. In the second one, their policies ensure that a small number of initial infections will not trigger a large-scale contagion, again while ensuring that the economic losses are minimized.
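The mixing and closure mechanics described above can be made concrete with a toy calculation. The Python sketch below is only an illustration under simplifying assumptions and is not the authors' model: the function names, the two-neighborhood numbers, the transmission rate `beta`, and the linear economic-loss rule are all hypothetical. It assumes that residents of neighborhood i spend a share `time_share[i][j]` of their time in neighborhood j, that an activity level between 0 and 1 scales everyone's exposure in a neighborhood, and that infection risk in a neighborhood is proportional to the share of infected people present there.

```python
def new_infections(susceptible, infected, population, time_share, activity, beta):
    """One-step new infections per home neighborhood (toy model, not the paper's)."""
    n = len(population)
    # Share of infected people among those present in each neighborhood j.
    # The activity level scales numerator and denominator equally, so it cancels here.
    prevalence = []
    for j in range(n):
        infected_present = sum(time_share[i][j] * infected[i] for i in range(n))
        people_present = sum(time_share[i][j] * population[i] for i in range(n))
        prevalence.append(infected_present / people_present if people_present > 0 else 0.0)
    # Residents of i are exposed in j for time_share[i][j] * activity[j] of their time.
    return [
        beta * susceptible[i]
        * sum(time_share[i][j] * activity[j] * prevalence[j] for j in range(n))
        for i in range(n)
    ]


def economic_loss(activity, value):
    """Hypothetical linear loss: economic value forgone by restricting each neighborhood."""
    return sum(v * (1.0 - a) for a, v in zip(activity, value))


# Hypothetical two-neighborhood city; all numbers are made up for illustration.
population = [1000.0, 1000.0]
infected = [0.0, 100.0]
susceptible = [1000.0, 900.0]
time_share = [[0.7, 0.3],   # residents of neighborhood 0
              [0.2, 0.8]]   # residents of neighborhood 1
value = [1.0, 2.0]          # economic value of each neighborhood at full activity
beta = 0.5

fully_open = new_infections(susceptible, infected, population, time_share, [1.0, 1.0], beta)
targeted = new_infections(susceptible, infected, population, time_share, [1.0, 0.5], beta)
targeted_loss = economic_loss([1.0, 0.5], value)

print("fully open:", fully_open)
print("targeted:  ", targeted, "loss:", targeted_loss)
```

In this toy setup, halving activity only in the neighborhood where infections are concentrated lowers new infections among residents of both neighborhoods, which mirrors the spillover effect described above. A planner would then compare the economic loss of different activity vectors that achieve the same reduction.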
The authors then illustrate their approach with an application to New York City (NYC). They use mobile phone data to model population movements and COVID-19 infection numbers to capture the state of the disease. Their results indicate that appropriate targeting achieves a reduction in infections with up to 12%–27% lower economic cost (by enabling 4.12 – 5.75 times more economic activity) than uniform (citywide) closure policies. The optimal policy allows for economic activity in Midtown (due to its economic importance) while imposing closures in many neighborhoods of the city (to curb the spread of the disease). Contrary to what might be intuitively expected, neighborhoods with larger levels of infections should not necessarily be the ones targeted with the most stringent economic closure measures. In addition, they show that coordination among neighboring counties and states is extremely important. For instance, depending on the policy followed by neighboring counties, it may become infeasible for NYC to prevent a contagion.
Read the full New York Times article here or the working paper here.
Prof. Chao Gao receives 2021 IMS Tweedie New Researcher Award
March 15, 2021
IDEAL Prof. Chao Gao has been selected to receive the 2021 Tweedie New Researcher Award from the Institute of Mathematical Statistics (IMS), “For groundbreaking contributions to robust statistics, including establishing connections with generative adversarial networks, network analysis, and high-dimensional statistical inference.” He will present the Tweedie New Researcher Invited Lecture at the 2021 IMS New Researchers Conference.
article link: https://stat.uchicago.edu/news/article/prof-chao-gao-2021-ims-tweedie-award/
Announcing IDEAL Postdoctoral Fellowships for 2020-2022
The institute invites applications for two postdoctoral fellowships starting Fall of 2020, to conduct interdisciplinary research that focuses on the theoretical foundations of data science. One fellowship is based at the Toyota Technological Institute at Chicago (TTIC) and one fellowship is based at Northwestern University.
Candidates will be expected to spend at least one day a week at the other campus. Ideal candidates will have interests in several of the special quarters that will be run from Fall 2020 to Spring 2022. By default, applicants will be considered for both postdoctoral positions. If you have a strong preference for one of the participating institutions, please specify it in your cover letter. We encourage candidates to send applications as soon as possible. Appointments begin in the Fall 2020 quarter. Applications received by February 22nd, 2020 will be given full consideration.
Information for postdoctoral fellowship applicants and other ways to participate in the institute are on the participation page.
Spring 2020 Kickoff Workshop
Research activities of the Special Quarter on Inference and Data Science on Networks kick off on Tuesday, May 5th, with a virtual workshop from 11 a.m. to 3 p.m. The organizers of the special quarter will give short talks on topics for study during the quarter. Researchers interested in attending can register to join by Zoom or watch the livestream on Panopto. Members of the institute are invited to participate in an open problems session and are encouraged to join in the research activities of the special quarter.
IDEAL Spring 2020 Special Quarter on Inference and Data Science on Networks Goes Remote
Three remote PhD courses kick off on Monday. PhD students from other institutions can apply to join remotely as virtual predoctoral fellows (application details on the special quarter page). In mid-April, the organizers will be holding a virtual kickoff workshop to give an overview of key research directions for the special quarter. Researchers (PhD students, postdocs, faculty, and research scientists) interested in joining the research efforts of the quarter should attend the kickoff workshop for details. The previously scheduled workshops are being rescheduled as remote workshops. Details of these workshops will be available shortly on the special quarter page and the IDEAL calendar.
New Collaborative Institute Aims to Explore Theoretical Foundations of Data Science
Joining forces with leading Chicago-area research institutions, Northwestern Engineering and the Weinberg College of Arts and Sciences Department of Economics co-launched the Institute for Data, Econometrics, Algorithms, and Learning (IDEAL).
IDEAL is a multidisciplinary (computer science, statistics, economics, electrical engineering, and operations research) and multi-institution (Northwestern University, Toyota Technological Institute at Chicago, and University of Chicago) institute focused on understanding key aspects of data science theory. Supported by the National Science Foundation HDR TRIPODS program, IDEAL aims to develop the foundations of data science by combining perspectives from algorithms, econometrics, and machine learning.