Colloquium (C)

Roee Shraga - Recovering Data Semantics

In data science, the main challenge is increasingly one of finding, curating, and understanding the data available to solve the problem at hand. Furthermore, modern-day data is challenging in that it lacks many forms of semantics ("meaning of data"). Metadata may be incomplete or unreliable, data sources may be unknown, and data documentation rarely exists. To address these challenges, the objective of my research is to recover data semantics throughout data discovery, versioning, integration, and quality.

08/12/2022 - 13:30

Dor Atzmon - Planning Paths for Multiple Moving Agents

In recent years, finding a path for a moving agent has become relevant to many real-world applications, including autonomous cars, moving robots, and flying drones. The optimal (shortest) path can be computed by heuristic search algorithms, such as the well-known A* algorithm. These algorithms use heuristics, which estimate the distance to the destination, to intelligently guide the search and quickly return a solution. In automated warehouses and many other applications, multiple agents exist.

01/12/2022 - 13:30

Adam Poliak - Limits and Applications of Natural Language Processing

Natural Language Processing (NLP) is a branch of Artificial Intelligence that aims to build machines that humans can seamlessly interact with through spoken and written language. As NLP becomes more ubiquitous in our daily lives, through technologies like Google Translate and Apple’s Siri, understanding the limits of these systems is critical. In the first half of the talk, we will cover my work developing diagnostics that test such systems.

13/01/2022 - 13:30

Tal Wagner - On the Role of Data in Algorithm Design

Recently, there has been a growing interest in harnessing the power of big datasets and modern machine learning for designing new scalable algorithms. This invites us to rethink the role of data in algorithm design: not just as the input to pre-designed algorithms, but also as a factor that enters the algorithm design process itself, driving it in a strong and possibly automated manner.

16/12/2021 - 11:30

Matan Hofree - Tissue level insights from cellular measurements – Discovery of multi-cellular hubs by co-variation analysis of single-cell expression

Therapy response varies considerably among cancer patients and depends on tumor-intrinsic factors and interactions with the tumor microenvironment and the host immune system. One example is the response of colorectal cancer (CRC) patients to immune therapy, which differs markedly between patients with mismatch repair-deficient (MMRd) and mismatch repair-proficient (MMRp) tumors.

16/12/2021 - 13:30

Talya Eden - Sublinear-time graph algorithms: motif counting and uniform sampling

In this talk, I will survey recent developments in approximate subgraph counting and subgraph sampling in sublinear time. Counting and sampling small subgraphs (aka motifs) are two of the most basic primitives in graph analysis, and have been studied extensively, both in theory and in practice. I will present the sublinear-time computational model, where access to the input graph is given only through queries, and will explain some of the concepts that underlie my results.

06/01/2022 - 11:30

Alon Eden - Information in Mechanism Design

The modern economy is becoming highly digital, demanding a combination of both economic and computational techniques. The abundance of data and the unique nature of digital markets highlight the special role of information in today’s economic systems. In this talk, I will discuss two domains of interest. The first is auction design in the interdependent value (IDV) setting. In IDV settings, buyers hold partial information regarding the quality of the good being sold, but their actual values also depend on information that other buyers have about the good.

06/01/2022 - 13:30

Amir Gilad - Informed Data Science

Data science has become prevalent in various fields that affect day-to-day lives, such as healthcare, banking, and the job market. The process of developing data science applications usually consists of several automatic systems that manipulate and prepare the data in different manners. Examples of automatic data manipulations and preparations include generating synthetic data, exploring and repairing data, and labeling it for machine learning.

30/12/2021 - 11:30

Elior Sulem - Learning with Less Data and Labels for Language Acquisition and Understanding

Natural Language Processing has attracted a lot of interest in recent years and has seen large improvements with the use of contextualized neural language models. However, the progress is limited to specific tasks where large datasets are available, and models are often brittle outside of these datasets. Also, current models are usually pretrained on extremely large unlabeled datasets, which limits our understanding of low-resource scenarios and makes their use infeasible for many in academia and industry.

30/12/2021 - 13:30

Tom Hope - Harnessing Scientific Literature for Boosting Discovery and Innovation

With millions of scientific papers coming out every year, researchers are forced to allocate their attention to increasingly narrow areas, creating isolated “research bubbles” that limit knowledge discovery. This fragmentation of science slows down progress and prevents the formation of cross-domain links that catalyze breakthroughs. Toward addressing these large-scale challenges for the future of science, my work develops new paradigms for searching, recommending and discovering scholarly knowledge.

09/12/2021 - 13:30