BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Department of Statistics and Operations Research - ECPv5.1.6//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Department of Statistics and Operations Research
X-ORIGINAL-URL:https://stat-or.unc.edu
X-WR-CALDESC:Events for Department of Statistics and Operations Research
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20190310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20191103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20200308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20201101T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190111T143000
DTEND;TZID=America/New_York:20190111T153000
DTSTAMP:20210515T195408
CREATED:20190102T184849Z
LAST-MODIFIED:20190102T220845Z
UID:3951-1547217000-1547220600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Bikram Karmakar\, University of Pennsylvania
DESCRIPTION:Bikram Karmakar \nStatistics Department\, The Wharton School\,\nUniversity of Pennsylvania \n \nEvidence factors for observational studies: methodology\, computation and applications. \n \nObservational studies aim to elucidate cause-and-effect relationships from situations in which treatment is not randomly assigned. A sensitivity analysis for an observational study assesses how much bias\, due to non-random assignment of treatment\, would be necessary to change the conclusions of an analysis that assumes treatment assignment was effectively random. Causal conclusions gain strength from a demonstration that they are insensitive to small or moderate violations of non-random treatment assignment\, especially if that happens in each of several statistically independent analyses that depend upon very different assumptions. In particular\, causal conclusions gain strength when evidence factors concur and are insensitive to bias\, where a study is said to contain two or more evidence factors if it provides two or more tests of the null hypothesis of no treatment effect that would be (essentially) independent were there no effect. Previous work with evidence factors has not addressed the problem that they involve multiple testing and how to control the type-I error to obtain valid inference. We develop a powerful method for controlling the familywise error rate for sensitivity analyses with evidence factors. We show that the Bahadur efficiency of sensitivity analysis for the combined evidence is greater than for any one evidence factor alone\, so that even though using two or more evidence factors requires multiple testing\, a study is better off asymptotically using two or more evidence factors than just one factor. \n \nWe also develop methods to widen the applicability of evidence factors to various designs for causal assessment\, including designs with instrumental variables\, and case-control studies. 
 Computationally\, however\, it is often very hard to build these designs optimally. Even the simplest addition to a one treatment-control comparison – a second comparison – creates design problems without polynomial-time solutions. We develop an “approximation algorithm” that provides a solution in polynomial time that is probably not much worse than the unattainable optimal solution. \n \nWe illustrate our methodological and computational developments for evidence factors in two observational studies: (i) the effect of exposure to radiation on solid cancers and (ii) the effect of having a side airbag on the chance of dying in a car crash. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-bikram-karmakar/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190114T143000
DTEND;TZID=America/New_York:20190114T153000
DTSTAMP:20210515T195408
CREATED:20190102T185216Z
LAST-MODIFIED:20190102T223233Z
UID:3959-1547476200-1547479800@stat-or.unc.edu
SUMMARY:STOR Colloquium: Joshua Cape\, Johns Hopkins
DESCRIPTION:Joshua Cape\nJohns Hopkins University \n \nStatistical analysis and spectral methods for signal-plus-noise matrix models \n \nEstimating eigenvectors and principal subspaces is of fundamental importance for numerous problems in statistics\, data science\, and network analysis\, including covariance matrix estimation\, principal component analysis\, and community detection. For each of these problems\, we obtain foundational results that precisely quantify the local (e.g.\, entrywise) behavior of sample eigenvectors within the context of a unified signal-plus-noise matrix framework. Our methods and results collectively address eigenvector consistency and asymptotic normality\, decompositions of high-dimensional matrices\, Procrustes analysis\, deterministic perturbation bounds\, and real-data spectral clustering applications in connectomics. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-joshua-cape/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190116T143000
DTEND;TZID=America/New_York:20190116T153000
DTSTAMP:20210515T195408
CREATED:20190102T185436Z
LAST-MODIFIED:20190102T223102Z
UID:3965-1547649000-1547652600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Yanglei Song\, University of Illinois
DESCRIPTION:Yanglei Song\nUniversity of Illinois at Urbana-Champaign \n \nAsymptotically optimal multiple testing with streaming data \n \nThe problem of testing multiple hypotheses with streaming (sequential) data arises in diverse applications such as multi-channel signal processing\, surveillance systems\, multi-endpoint clinical trials\, and online surveys. In this talk\, we investigate the problem under two generalized error metrics. Under the first one\, the probability of at least k mistakes\, of any kind\, is controlled. Under the second\, the probabilities of at least k1 false positives and at least k2 false negatives are simultaneously controlled. For each formulation\, we characterize the optimal expected sample size to a first-order asymptotic approximation as the error probabilities vanish\, and propose a novel procedure that is asymptotically efficient under every signal configuration. These results are established when the data streams for the various hypotheses are independent and each local log-likelihood ratio statistic satisfies a certain law of large numbers. Further\, in the special case of iid observations\, we quantify the asymptotic gains of sequential sampling over fixed-sample size schemes. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-yanglei-song/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190118T143000
DTEND;TZID=America/New_York:20190118T153000
DTSTAMP:20210515T195408
CREATED:20190102T185356Z
LAST-MODIFIED:20190108T194040Z
UID:3963-1547821800-1547825400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Vince Lyzinski\, Univ. of Massachusetts Amherst
DESCRIPTION:Vince Lyzinski\nThe University of Massachusetts Amherst \n \nGraph matching in edge-independent networks \n \nThe graph matching problem seeks to find an alignment between the vertex sets of two graphs that best preserves common structure across graphs. Here\, we consider the closely related problem of graph matchability: Given a latent alignment between the vertex sets of two graphs\, under what conditions will the solution to the graph matching optimization problem recover this alignment in the presence of shuffled vertex labels? We consider the problem of graph matchability in non-identically distributed networks\, and working in a general class of edge-independent network models\, we demonstrate that graph matchability is almost surely lost when matching the networks directly\, and is almost perfectly recovered when first centering the networks using Universal Singular Value Thresholding before matching. While there are currently no efficient algorithms for solving the graph matching problem in general\, these results nonetheless provide practical algorithmic guidance for approximately matching networks in both real and synthetic data applications. \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-vince-lyzinski/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190123T143000
DTEND;TZID=America/New_York:20190123T153000
DTSTAMP:20210515T195408
CREATED:20190102T185137Z
LAST-MODIFIED:20190116T144101Z
UID:3957-1548253800-1548257400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Jingshen Wang\, University of Michigan
DESCRIPTION:Jingshen Wang\nUniversity of Michigan \n \nInference on Treatment Effects after Model Selection \n \nInferring cause-effect relationships between variables is of primary importance in many sciences. In this talk\, I will discuss two approaches for making valid inference on treatment effects when a large number of covariates are present. The first approach is to perform model selection and then to deliver inference based on the selected model. If the inference is made ignoring the randomness of the model selection process\, then there could be severe biases in estimating the parameters of interest. While the estimation bias in an under-fitted model is well understood\, I will address a lesser known bias that arises from an over-fitted model. The over-fitting bias can be eliminated through data splitting at the cost of statistical efficiency\, and I will propose a repeated data splitting approach to mitigate the efficiency loss. The second approach concerns the existing methods for debiased inference. I will show that the debiasing approach is an extension of OLS to high dimensions\, and that a careful bias analysis leads to an improvement to further control the bias. The comparison between these two approaches provides insights into their intrinsic bias-variance trade-off\, and I will show that the debiasing approach may lose efficiency in observational studies. \n \nThis is joint work with Xuming He and Gongjun Xu. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-jingshen-wang/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190125T143000
DTEND;TZID=America/New_York:20190125T153000
DTSTAMP:20210515T195408
CREATED:20190102T184949Z
LAST-MODIFIED:20190117T205204Z
UID:3953-1548426600-1548430200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Geoffrey Schiebinger\, MIT and Harvard
DESCRIPTION:Geoffrey Schiebinger \nThe Broad Institute of MIT and Harvard and \nthe MIT Statistics and Data Science Center \n \nTowards a mathematical theory of development \n \nIn this talk we introduce a mathematical model to describe temporal processes like embryonic development and cellular reprogramming. We consider stochastic processes in gene expression space to represent developing populations of cells\, and we use optimal transport to recover the temporal couplings of the process. We apply these ideas to study 315\,000 single-cell RNA-sequencing profiles collected at 40 time points over 18 days of reprogramming fibroblasts into induced pluripotent stem cells. To validate the optimal transport model\, we demonstrate that it can accurately predict developmental states at held-out time points. We construct a high-resolution map of reprogramming that rediscovers known features; uncovers new alternative cell fates including neural- and placental-like cells; predicts the origin and fate of any cell class; and implicates regulatory models in particular trajectories. Of these findings\, we highlight the transcription factor Obox6 and the paracrine signaling factor GDF9\, which we experimentally show enhance reprogramming efficiency. Our approach provides a general framework for investigating cellular differentiation\, and poses some interesting questions in theoretical statistics. \n \nBio: Geoffrey Schiebinger is a postdoctoral fellow in the MIT Center for Statistics and the Klarman Cell Observatory at the Broad Institute of MIT and Harvard. His postdoctoral mentors are Eric Lander\, Aviv Regev\, and Philippe Rigollet. Geoffrey has won numerous academic awards including a Career Award at the Scientific Interface from the Burroughs Wellcome Fund\, and a prize for Best Contribution to the conference Statistical Challenges in Single Cell Analysis organized by ETH Zurich. Before coming to MIT\, Geoff studied Statistics at UC Berkeley\, where he earned his Ph.D. in May 2016 for a doctoral thesis on the Mathematics of Precision Measurement. He is fortunate to have been advised by Benjamin Recht\, and to have worked with Martin Wainwright\, Bin Yu\, and Aditya Guntuboyina. Geoffrey attended Stanford University from 2007 to 2011 for his undergraduate degree in mathematics with a minor in physics and master’s degree in electrical engineering. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-geoffrey-schiebinger/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190128T143000
DTEND;TZID=America/New_York:20190128T153000
DTSTAMP:20210515T195408
CREATED:20190102T185044Z
LAST-MODIFIED:20190116T144345Z
UID:3955-1548685800-1548689400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Jeffrey Regier\, Univ. of California\, Berkeley
DESCRIPTION:Jeffrey Regier\nUniversity of California\, Berkeley \n \nStatistical Inference for Cataloging the Visible Universe \n \nA key task in astronomy is to locate astronomical objects in images and to characterize them according to physical parameters such as brightness\, color\, and morphology. This task\, known as cataloging\, is challenging for several reasons: many astronomical objects are much dimmer than the sky background\, labeled data is generally unavailable\, overlapping astronomical objects must be resolved collectively\, and the datasets are enormous: terabytes now\, petabytes soon. Existing approaches to cataloging are largely based on algorithmic software pipelines that lack an explicit inferential basis. In this talk\, I present a new approach to cataloging based on inference in a fully specified probabilistic model. I consider two inference procedures: one based on variational inference (VI) and another based on MCMC. A distributed implementation of VI\, written in Julia and run on a supercomputer\, achieves petascale performance\, a first for any high-productivity programming language. The run is the largest-scale application of Bayesian inference reported to date. In an extension\, using new ideas from variational autoencoders and deep learning\, I avoid many of the traditional disadvantages of VI relative to MCMC\, and improve model fit. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-jeffrey-regier/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190130T143000
DTEND;TZID=America/New_York:20190130T153000
DTSTAMP:20210515T195408
CREATED:20190102T185259Z
LAST-MODIFIED:20190116T144153Z
UID:3961-1548858600-1548862200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Pragya Sur\, Stanford University
DESCRIPTION:Pragya Sur\n Stanford University \n \nA modern maximum-likelihood approach for \nhigh-dimensional logistic regression \n \nLogistic regression is arguably the most widely used and studied non-linear model in statistics. Classical maximum-likelihood theory based statistical inference is ubiquitous in this context. This theory hinges on well-known fundamental results: (1) the maximum-likelihood-estimate (MLE) is asymptotically unbiased and normally distributed\, (2) its variability can be quantified via the inverse Fisher information\, and (3) the likelihood-ratio-test (LRT) is asymptotically a Chi-Squared. In this talk\, I will show that in the common modern setting where the number of features and the sample size are both large and comparable\, classical results are far from accurate. In fact\, (1) the MLE is biased\, (2) its variability is far greater than classical results\, and (3) the LRT is not distributed as a Chi-Square. Consequently\, p-values obtained based on classical theory are completely invalid in high dimensions. In turn\, I will propose a new theory that characterizes the asymptotic behavior of both the MLE and the LRT under some assumptions on the covariate distribution\, in a high-dimensional setting. Empirical evidence demonstrates that this asymptotic theory provides accurate inference in finite samples. Practical implementation of these results necessitates the estimation of a single scalar\, the overall signal strength\, and I will propose a procedure for estimating this parameter precisely. \nThis is based on joint work with Emmanuel Candes and Yuxin Chen. \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-pragya-sur/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190215T143000
DTEND;TZID=America/New_York:20190215T153000
DTSTAMP:20210515T195408
CREATED:20190201T212611Z
LAST-MODIFIED:20190208T164533Z
UID:3998-1550241000-1550244600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Elina Robeva\, MIT
DESCRIPTION:Elina Robeva\nMassachusetts Institute of Technology \n\nMaximum likelihood estimation under total positivity \n \nNonparametric density estimation is a challenging statistical problem — in general the maximum likelihood estimate (MLE) does not even exist! Introducing shape constraints allows a path forward. In this talk I will discuss non-parametric density estimation under total positivity (i.e. log-supermodularity). Though they possess very special structure\, totally positive random variables are quite common in real world data and exhibit appealing mathematical properties. Given i.i.d. samples from a totally positive and log-concave distribution\, we prove that the MLE exists with probability one if there are at least 3 samples. We characterize the domain of the MLE\, and give algorithms to compute it. If the observations are 2-dimensional or binary\, we show that the logarithm of the MLE is a piecewise linear function and can be computed via a certain convex program. Finally\, I will discuss statistical guarantees for the convergence of the MLE\, and will conclude with a variety of further research directions. \n
URL:https://stat-or.unc.edu/event/graduate-student-seminar-haipeng-gao/
LOCATION:Hanes 120
CATEGORIES:Graduate Seminar,STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190220T143000
DTEND;TZID=America/New_York:20190220T153000
DTSTAMP:20210515T195408
CREATED:20190218T171003Z
LAST-MODIFIED:20190218T171003Z
UID:4021-1550673000-1550676600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Abishek Sankararaman\, UT-Austin
DESCRIPTION:Abishek Sankararaman \nThe University of Texas at Austin \n \nInterference Queuing Networks and \nSpatial Birth-Death Processes \n \nMotivated by applications in wireless networks\, we consider a class of spatial birth-death processes\, both on the continuum and on the discrete space of grids. In the continuum\, we consider an interacting particle birth-death process on a compact torus and study questions of stochastic stability. In the discrete setting\, we consider a countably infinite collection of coupled queues representing a large wireless network in the Euclidean space. There is one queue at each point of the d-dimensional integer grid. These queues have independent Poisson arrivals\, but are coupled through their service rates. The service discipline is of the processor sharing type\, with the service rate in each queue slowed down when the neighboring queues have a larger workload. More precisely\, the service rate is the signal to interference ratio of wireless network theory. This coupling is parameterized by a symmetric and translation invariant interference sequence. The dynamics is infinite dimensional Markov\, with each queue having a non-compact state space. It is neither reversible nor asymptotically product form\, as in the mean-field setting. Coupling and percolation techniques are first used to show that this dynamics has well defined trajectories. Coupling from the past techniques of the Loynes’ type are then proposed to build its minimal stationary regime. This regime is the one obtained when starting from the all empty initial condition in the distant past. The rate conservation principle of Palm calculus is then used to identify the stability condition of this system\, namely the condition on the interference sequence and arrival rates guaranteeing the finiteness of this minimal regime. We show that the identified condition is also necessary in certain special cases and conjecture that it is necessary in all cases. Remarkably\, the rate conservation principle also provides a closed form expression for its mean queue size. When the stability condition holds\, this minimal solution is the unique stationary regime\, provided it has finite second moments\, and this is the case if the arrival rate is small enough. In addition\, there exists a range of small initial conditions for which the dynamics is attracted to the minimal regime. Surprisingly\, however\, there exists another range of larger though finite initial conditions for which the dynamics diverges\, even though the stability criterion holds. \n \nBased on papers – https://arxiv.org/pdf/1604.07884.pdf and https://arxiv.org/pdf/1710.09797.pdf.
URL:https://stat-or.unc.edu/event/stor-colloquium-abishek-sankararaman-ut-austin/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190301T143000
DTEND;TZID=America/New_York:20190301T153000
DTSTAMP:20210515T195408
CREATED:20190226T161801Z
LAST-MODIFIED:20190226T161801Z
UID:4026-1551450600-1551454200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Tucker McElroy\, US Census Bureau
DESCRIPTION:Tucker McElroy\nU.S. Census Bureau \nCasting Vector Time Series: Forecasting\, Imputation\, and Signal Extraction In the Context of Big Data \n \nRecursive algorithms\, based upon the nested structure of Toeplitz covariance matrices arising from stationary processes\, are presented for the efficient computation of multi-step ahead forecast error covariances for nonstationary vector time series. Further\, we discuss time reversal to forecast the past\, and algorithms for imputation of missing values. These quantities are required to quantify multi-step ahead forecast error and signal extraction error. The methods are applied to daily retail data exhibiting trend dynamics and seasonality. \n \nShort bio: Dr. McElroy is Senior Time Series Mathematical Statistician at the U.S. Census Bureau\, where he has served the last 15 years as a researcher and consultant on seasonal adjustment problems. He has a B.A. from Columbia University (1996)\, and a Ph.D. in mathematics from University of California\, San Diego (2001).
URL:https://stat-or.unc.edu/event/stor-colloquium-tucker-mcelroy-us-census-bureau/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190304T143000
DTEND;TZID=America/New_York:20190304T153000
DTSTAMP:20210515T195408
CREATED:20190226T172921Z
LAST-MODIFIED:20190226T172921Z
UID:4028-1551709800-1551713400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Jonathan M. Lees\, UNC-Chapel Hill
DESCRIPTION:Jonathan M. Lees \nUniversity of North Carolina at Chapel Hill \n \nGeophysical Time Series Analysis on Volcanoes: \nCan we quantify non-linearity? \n \nMost geophysical processes are aperiodic\, noisy\, intermittent\, and transient. This requires specialized methods for time series analysis that seek patterns that vary in space and time. I present here examples from research on exploding volcanoes that exhibit tremor that appears to be resonant but likely results from nonlinear feedback systems. The physical models for these observations are highly controversial\, so detailed analysis of patterns observed in the time series represents a significant challenge for understanding volcano dynamics. Seismic and acoustic waves recorded at a number of volcanoes will be presented\, including Karymsky (Kamchatka\, Russia)\, Santiaguito (Guatemala)\, Tungurahua (Ecuador) and others. Methods involving time-domain and frequency-domain analysis\, wavelet transforms\, Hilbert-Huang empirical mode approaches and bispectral decomposition will be illustrated.
URL:https://stat-or.unc.edu/event/stor-colloquium-jonathan-m-lees-unc-chapel-hill/
LOCATION:Hanes Hall
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190320T153000
DTEND;TZID=America/New_York:20190320T163000
DTSTAMP:20210515T195408
CREATED:20190220T221249Z
LAST-MODIFIED:20190220T221249Z
UID:4023-1553095800-1553099400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Yuan Liao\, Rutgers University
DESCRIPTION:Yuan Liao \nRutgers University \n \n \nFactor-Driven Two-Regime Regression using \nMixed Integer Programming \n \nWe propose a two-regime regression model where the switching between the regimes is driven by a vector of possibly unobservable factors. When the factors are latent\, we estimate them by the principal component analysis of a much larger data set. We show that the optimization problem can be reformulated as mixed integer optimization and present two alternative computational algorithms: (1) MI quadratic programming and (2) MI linear programming. We show that (1) is numerically equivalent to the original least squares problem\, but runs slowly. On the other hand\, (2) runs much faster\, and produces asymptotically equivalent estimators. We derive the asymptotic distributions of the resulting estimators\, and establish a phase transition that describes the effect of first stage factor estimation. \n \nThe paper can be downloaded from https://arxiv.org/abs/1810.11109
URL:https://stat-or.unc.edu/event/stor-colloquium-yuan-liao-rutgers-university/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190401T153000
DTEND;TZID=America/New_York:20190401T163000
DTSTAMP:20210515T195408
CREATED:20190312T134500Z
LAST-MODIFIED:20190312T192835Z
UID:4066-1554132600-1554136200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Serhan Ziya\, UNC-Chapel Hill
DESCRIPTION:Serhan Ziya\nUniversity of North Carolina at Chapel Hill \n \nService operations with a focus in healthcare: a partial and \nsubjective overview of related research in STOR \n \nThis talk will provide an overview of some of the research projects in which the speaker is either currently an active participant or hopes to help initiate in the near future. The main goal is to create awareness and generate interest in the area of service operations\, particularly as it relates to healthcare\, and to highlight opportunities for potential collaborations with different research groups within or outside STOR. The first half of the talk will be devoted to a relatively detailed presentation of an interdisciplinary research project on improving patient flow in emergency departments\, while the second half will consist of a quick exposition of several other projects\, which have features that are likely to be of interest to researchers with various methodological backgrounds. \n \nCollaborators on the patient flow project are Nilay Tanik Argon\, Tommy Bohrmann\, Wanyi Chen\, Benjamin Linthicum\, Kenny Lopiano\, Abhi Mehrotra\, and Debbie Travers. Collaborators on other projects will be introduced during the talk.
URL:https://stat-or.unc.edu/event/stor-colloquium-serhan-ziya-unc-chapel-hill/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190923T153000
DTEND;TZID=America/New_York:20190923T163000
DTSTAMP:20210515T195408
CREATED:20190830T192333Z
LAST-MODIFIED:20190830T192333Z
UID:4356-1569252600-1569256200@stat-or.unc.edu
SUMMARY:Colloquium: Peter J. Mucha\, UNC-Chapel Hill
DESCRIPTION:Peter J. Mucha \nThe University of North Carolina at Chapel Hill \nDepartment of Mathematics \nCommunities in Multilayer Networks \nCommunity detection describes the organization of a network in terms of patterns of connection\, identifying tightly connected structures known as communities. A wide variety of methods for community detection have been proposed\, with a number of software packages available for performing community detection. In the past decade\, there has been increased interest in multilayer networks\, a general framework that can be used to describe networks with multiple types of relationships\, that change in time\, or that network together multiple kinds of networks. We describe various generalizations of community detection to multilayer networks\, including results about detectability limits and a new post-processing procedure to explore the parameter space of multilayer modularity\, along with pointers to using community detection in applications.
URL:https://stat-or.unc.edu/event/colloquium-peter-j-mucha-unc-chapel-hill/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20191004T153000
DTEND;TZID=America/New_York:20191004T163000
DTSTAMP:20210515T195408
CREATED:20190926T175045Z
LAST-MODIFIED:20190930T151845Z
UID:4418-1570203000-1570206600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Tong Wang\, University of Iowa
DESCRIPTION:Dr. Tong Wang \nTippie College of Business \nUniversity of Iowa \n \nHybrid Predictive Model: When an Interpretable Model Collaborates with a Black-box Model \n \nInterpretable machine learning has received increasing interest in recent years\, especially in domains where humans are involved in the decision-making process. However\, the possible loss of the task performance for gaining interpretability is often inevitable\, especially for large datasets or complicated tasks. This performance downgrade puts practitioners in a dilemma of choosing between a top-performing black-box model with no explanations and an interpretable model with unsatisfying task performance. In this work\, we propose a novel framework for building a Hybrid Predictive Model (HPM) that integrates an interpretable model with any black-box model to introduce interpretability in the decision-making process at no or low cost of the predictive accuracy. The interpretable model substitutes the black-box model on a subset of data where the black-box model is overkill or nearly overkill. We design a principled objective function that considers predictive accuracy\, model interpretability\, and model transparency\, which is the percentage of data processed by the interpretable model. This framework brings together the advantages of the high predictive performance of black-box models and the high interpretability of interpretable models. We instantiate the proposed framework with two types of models\, one using decision rules as the interpretable collaborator and one using linear models. For both models\, we develop customized training algorithms with theoretically grounded bounds to reduce computation. We test the hybrid predictive models on structured datasets and text data. In these experiments\, the interpretable models collaborate with state-of-the-art black-box models including ensemble models and neural networks. 
 We propose to use efficient frontiers to characterize the trade-off between transparency and predictive performance. Results show that hybrid models are able to obtain transparency at no or low cost of predictive performance. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-tong-wang-university-of-iowa/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20191014T153000
DTEND;TZID=America/New_York:20191014T163000
DTSTAMP:20210515T195408
CREATED:20190911T120737Z
LAST-MODIFIED:20191009T184019Z
UID:4386-1571067000-1571070600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Heping Zhang\, Yale
DESCRIPTION:Back to the Basics: Residuals and Diagnostics for Generalized Linear Models\nHeping Zhang \nSusan Dwight Bliss Professor of Biostatistics \nYale University School of Public Health \nOrdinal outcomes are common in scientific research and everyday practice\, and we often rely on regression models to make inference. A long-standing problem with such regression analyses is the lack of effective diagnostic tools for validating model assumptions. The difficulty arises from the fact that an ordinal variable has discrete values that are labeled with\, but are not\, numerical values. The values merely represent ordered categories. In this paper\, we propose a surrogate approach to defining residuals for an ordinal outcome Y. The idea is to define a continuous variable S as a “surrogate” of Y and then obtain residuals based on S. For the general class of cumulative link regression models\, we study the residual’s theoretical and graphical properties. We show that the residual has null properties similar to those of the common residuals for continuous outcomes. Our numerical studies demonstrate that the residual has power to detect misspecification with respect to 1) mean structures; 2) link functions; 3) heteroscedasticity; 4) proportionality; and 5) mixed populations. The proposed residual also enables us to develop numeric measures for goodness-of-fit using classical distance notions. Our results suggest that compared to a previously defined residual\, our residual can reveal deeper insights into model diagnostics. We stress that this work focuses on residual analysis\, rather than hypothesis testing. The latter has limited utility as it only provides a single p-value\, whereas our residual can reveal what components of the model are misspecified and advise how to make improvements. \nThis is joint work with Dungang Liu\, University of Cincinnati Lindner College of Business. \nThe entire article can be viewed here.
URL:https://stat-or.unc.edu/event/stor-colloquium-heping-zhang-yale/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20191021T153000
DTEND;TZID=America/New_York:20191021T163000
DTSTAMP:20210515T195408
CREATED:20190926T175335Z
LAST-MODIFIED:20191015T195648Z
UID:4422-1571671800-1571675400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Shujie Ma\, UC Riverside
DESCRIPTION:Shujie Ma \nUniversity of California\, Riverside \n \nHow many communities are there in a network? \n \nAdvances in modern technology have facilitated the collection of network data which emerge in many fields including biology\, bioinformatics\, physics\, economics\, sociology and so forth. Network data often have natural communities which are groups of interacting objects (i.e.\, nodes); pairs of nodes in the same group tend to interact more than pairs belonging to different groups. Community detection then becomes a very important task\, allowing us to identify and understand the structure of a network. Thus\, the development of methods for community detection has attracted much attention in the past decade\, and as a result\, different efficient approaches have been proposed in the literature. \n \nA fundamental limitation of most existing methods is that they divide networks into a fixed number of communities\, i.e.\, the number of communities is known and given in advance. However\, in practice\, such prior information is typically unavailable. Determining the number of communities is a challenging yet important task\, as the subsequent community detection procedure relies upon it. In this talk\, I will introduce a convenient and effective solution to this problem under the degree-corrected stochastic block model (DC-SBM). The proposed method takes advantage of spectral clustering\, the likelihood principle\, and binary segmentation. Determining the number of communities is essentially a model selection problem\, and we therefore establish the selection consistency of our proposed procedure under a mild condition on the average degree. We demonstrate the approach on different networks. At the end of my talk\, I will briefly talk about our other ongoing and future research projects in this line of work. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-shujie-ma-uc-riverside/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20191106T153000
DTEND;TZID=America/New_York:20191106T163000
DTSTAMP:20210515T195408
CREATED:20190903T184239Z
LAST-MODIFIED:20191025T145704Z
UID:4358-1573054200-1573057800@stat-or.unc.edu
SUMMARY:STOR Colloquium: Jianqing Fan\, Princeton
DESCRIPTION:Jianqing Fan \nPrinceton University \n \nStatistical Inference on Membership Profiles in Large Networks \n \nNetwork data is prevalent in many contemporary big data applications in which a common interest is to unveil important latent links between different pairs of nodes. The nodes can be broadly defined such as individuals\, economic entities\, documents\, or medical disorders in social\, economic\, text\, or health networks. Yet a simple question of how to precisely quantify the statistical uncertainty associated with the identification of latent links still remains largely unexplored. In this talk\, we suggest the method of statistical inference on membership profiles in large networks (SIMPLE) in the setting of the degree-corrected mixed membership model\, where the null hypothesis assumes that the pair of nodes share the same profile of community memberships. In the simpler case of no degree heterogeneity\, the model reduces to the mixed membership model and an alternative\, more robust test is proposed. Under some mild regularity conditions\, we establish the exact limiting distributions of the two forms of SIMPLE test statistics under the null hypothesis and their asymptotic properties under the alternative hypothesis. Both forms of SIMPLE tests are pivotal and have asymptotic size at the desired level and asymptotic power one. The advantages and practical utility of our new method in terms of both size and power are demonstrated through several simulation examples and real network applications. \n(Joint work with Yingying Fan\, Xiao Han\, and Jinchi Lv) \n \nThe talk is based on the following paper on arxiv.org \nFan\, J.\, Fan\, Y.\, Han\, X. and Lv\, J. (2019). SIMPLE: Statistical Inference \non Membership Profiles in Large Networks
URL:https://stat-or.unc.edu/event/stor-colloquium-jainqing-fan-princeton/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20200108T153000
DTEND;TZID=America/New_York:20200108T163000
DTSTAMP:20210515T195408
CREATED:20200103T194647Z
LAST-MODIFIED:20200103T194647Z
UID:5309-1578497400-1578501000@stat-or.unc.edu
SUMMARY:STOR Colloquium: Anna Little\, Michigan State University
DESCRIPTION:Robust Statistical Procedures for Noisy\, \nHigh-dimensional Data \n \nThis talk addresses two topics related to robust statistical procedures for analyzing noisy\, high-dimensional data: (I) path-based spectral clustering and (II) robust multi-reference alignment. Both methods must overcome a large ambient dimension and heavy noise to extract the relevant low-dimensional data structure in a computationally efficient way. In (I)\, the goal is to partition the data into meaningful groups\, and this is achieved by a novel approach which combines a data-driven metric with graph-based clustering. Using a data-driven metric allows for strong theoretical guarantees and fast algorithms when clusters concentrate around low-dimensional sets. In (II)\, the goal is to recover a hidden signal from many noisy observations of the hidden signal\, where each noisy observation includes a random translation\, a random dilation\, and high additive noise. A wavelet-based approach is used to apply a data-driven\, nonlinear unbiasing procedure\, so that the estimate of the hidden signal is robust to high-frequency perturbations. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-anna-little-michigan-state-university/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20200110T153000
DTEND;TZID=America/New_York:20200110T163000
DTSTAMP:20210515T195408
CREATED:20200106T161007Z
LAST-MODIFIED:20200106T161007Z
UID:5311-1578670200-1578673800@stat-or.unc.edu
SUMMARY:STOR Colloquium: Yao Li\, UC Davis
DESCRIPTION:On the Robustness of Machine Learning Systems \n \nDeep neural networks (DNNs) are one of the most prominent technologies of our time\, as they achieve state-of-the-art performance in many machine learning tasks\, including but not limited to image classification\, text mining\, and speech processing. However\, recent studies have demonstrated the vulnerability of deep neural networks to adversarial examples\, i.e.\, examples that are carefully crafted to fool a well-trained deep neural network while being indistinguishable from natural images to humans. This makes it unsafe to apply neural networks in security-critical applications. We present two algorithms\, Adversarial Bayesian Neural Network (Adv-BNN) and Embedding Regularized Classifier (ER-Classifier)\, to train neural networks that are robust against adversarial examples. Motivated by the ideas that randomness and adversarial training can improve the robustness of neural networks\, we formulate the min-max problem in a Bayesian neural network to learn the best model distribution under adversarial attacks\, leading to an adversarially trained Bayesian neural network. Another algorithm\, ER-Classifier\, is inspired by the observation that the intrinsic dimension of image data is much smaller than its pixel space dimension and the vulnerability of neural networks grows with the input dimension. We propose to embed high-dimensional input images in a low-dimensional space to perform classification and to regularize the embedding space in the training process. Experimental results on several benchmark datasets show that our proposed frameworks achieve state-of-the-art performance against strong adversarial attack methods. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-yao-li-uc-davis/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20200113T153000
DTEND;TZID=America/New_York:20200113T163000
DTSTAMP:20210515T195408
CREATED:20200108T182804Z
LAST-MODIFIED:20200108T182804Z
UID:5318-1578929400-1578933000@stat-or.unc.edu
SUMMARY:STOR Colloquium: Paromita Dubey\, UC-Davis
DESCRIPTION:Fréchet Change Point Detection \nChange point detection is a popular tool for identifying locations in a data sequence where an abrupt change occurs in the data distribution and has been widely studied for Euclidean data. Modern data is very often non-Euclidean\, for example distribution-valued data or network data. Change point detection is a challenging problem when the underlying data space is a metric space where one does not have basic algebraic operations like addition of the data points and scalar multiplication. \nIn this talk\, I propose a method to infer the presence and location of change points in the distribution of a sequence of independent data taking values in a general metric space. Change points are viewed as locations at which the distribution of the data sequence changes abruptly in terms of either its Fréchet mean or Fréchet variance or both. The proposed method is based on comparisons of Fréchet variances before and after putative change point locations. First\, I will establish that under the null hypothesis of no change point the limit distribution of the proposed scan function is the square of a standardized Brownian Bridge. It is well known that such convergence is rather slow in moderate to high dimensions. For more accurate results in finite sample applications\, I will provide a theoretically justified bootstrap-based scheme for testing the presence of change points. Next\, I will show that when a change point exists\, (1) the proposed test is consistent under contiguous alternatives and (2) the estimated location of the change point is consistent. All of the above results hold for a broad class of metric spaces under mild entropy conditions. Examples include the space of univariate probability distributions and the space of graph Laplacians for networks. I will illustrate the efficacy of the proposed approach in empirical studies and in real data applications with sequences of maternal fertility distributions. Finally\, I will talk about some future extensions and other related research directions\, for instance\, when one has samples of dynamic metric space data. This talk is based on joint work with Prof. Hans-Georg Müller. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-paromita-dubey-uc-davis/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20200115T153000
DTEND;TZID=America/New_York:20200115T163000
DTSTAMP:20210515T195408
CREATED:20200110T190422Z
LAST-MODIFIED:20200110T190422Z
UID:5323-1579102200-1579105800@stat-or.unc.edu
SUMMARY:STOR Colloquium: Yuqi Gu\, University of Michigan
DESCRIPTION:Uncover Hidden Fine-Grained Scientific Information: Structured Latent Attribute Models \n \nIn modern psychological and biomedical research with diagnostic purposes\, scientists often formulate the key task as inferring the fine-grained latent information under structural constraints. These structural constraints usually come from the domain experts’ prior knowledge or insight. The emerging family of Structured Latent Attribute Models (SLAMs) accommodates these modeling needs and has received substantial attention in psychology\, education\, and epidemiology. SLAMs bring exciting opportunities and unique challenges. In particular\, with high-dimensional discrete latent attributes and structural constraints encoded by a design matrix\, one needs to balance the gain in the model’s explanatory power and interpretability against the difficulty of understanding and handling the complex model structure. \n \nIn the first part of this talk\, I present identifiability results that advance the theoretical knowledge of how the design matrix influences the estimability of SLAMs. The new identifiability conditions guide real-world practices of designing diagnostic tests and also lay the foundation for drawing valid statistical conclusions. In the second part\, I introduce a statistically consistent penalized likelihood approach to selecting significant latent patterns in the population. I also propose a scalable computational method. These developments explore an exponentially large model space involving many discrete latent variables\, and they address the estimation and computation challenges of high-dimensional SLAMs arising from large-scale scientific measurements. The application of the proposed methodology to the data from an international educational assessment reveals a meaningful knowledge structure of the student population. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-yuqi-gu-university-of-michigan/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20200122T153000
DTEND;TZID=America/New_York:20200122T163000
DTSTAMP:20210515T195408
CREATED:20200110T190524Z
LAST-MODIFIED:20200110T190524Z
UID:5325-1579707000-1579710600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Lan Luo\, University of Michigan
DESCRIPTION:Lan Luo \nUniversity of Michigan \n \nRenewable Estimation and Incremental Inference \nin Streaming Data Analysis \n \nNew data collection and storage technologies have given rise to a new field of streaming data analytics\, including real-time statistical methodology for online data analyses. Streaming data refers to high-throughput recordings with large volumes of observations gathered sequentially and perpetually over time. Such a data collection scheme is pervasive not only in biomedical sciences such as mobile health\, but also in other fields such as IT\, finance\, service\, and operations. This talk primarily concerns the development of a real-time statistical estimation and inference method for regression analysis\, with a particular objective of addressing challenges in streaming data storage and computational efficiency. Termed “renewable estimation”\, this method enjoys strong theoretical guarantees\, including asymptotic consistency and statistical efficiency\, as well as fast computational speed. The key technical novelty is that the proposed method uses only the current data and summary statistics of historical data. The proposed algorithm will be demonstrated in generalized linear models (GLM) for cross-sectional data and quadratic inference functions (QIF) for correlated data. I will discuss both conceptual understanding and theoretical guarantees of the method and illustrate its performance via numerical examples. This is joint work with my supervisor Professor Peter Song. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-lan-luo-university-of-michigan/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20200124T153000
DTEND;TZID=America/New_York:20200124T163000
DTSTAMP:20210515T195408
CREATED:20200117T194902Z
LAST-MODIFIED:20200117T194902Z
UID:5330-1579879800-1579883400@stat-or.unc.edu
SUMMARY:STOR/Computational Med colloquium: Zhengwu Zhang\, University of Rochester
DESCRIPTION:Zhengwu Zhang \nUniversity of Rochester \nStatistical Analysis of Brain Structural Connectomes \n \nThere have been remarkable advances in imaging technology\, used routinely and pervasively in many human studies\, that non-invasively measures human brain structure and function. Among them\, a particular imaging modality called diffusion magnetic resonance imaging (dMRI) is used to infer shapes of millions of white matter fiber tracts that act as highways for neural activity and communication across the brain. The collection of interconnected fiber tracts is referred to as the brain connectome. There is increasing evidence that an individual’s brain connectome plays a fundamental role in cognitive functioning\, behavior\, and the risk of developing mental disorders. Improved mechanistic understanding of relationships between brain connectome structure and phenotypes is critical to the prevention and treatment of mental disorders. However\, progress in this area has been limited due to the complexity of the data. In this talk\, I will present challenges of analyzing such data and our recent progress\, including connectome reconstruction and novel statistical modeling methods. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-computational-med-colloquium-zhengwu-zhang-university-of-rochester/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20200131T153000
DTEND;TZID=America/New_York:20200131T163000
DTSTAMP:20210515T195408
CREATED:20200124T164951Z
LAST-MODIFIED:20200124T164951Z
UID:5338-1580484600-1580488200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Eric Lock\, University of Minnesota
DESCRIPTION:Eric Lock \nUniversity of Minnesota School of Public Health \nBidimensional Linked Matrix Decomposition for \nPan-Omics Pan-Cancer Analysis \nSeveral recent methods address the integrative dimension reduction and decomposition of linked high‐content data matrices. Typically\, these methods consider one dimension\, rows or columns\, that is shared among the matrices. This shared dimension may represent common features measured for different sample sets (horizontal integration) or a common sample set with features from different platforms (vertical integration). This is limiting for data that take the form of bidimensionally linked matrices\, e.g.\, multiple molecular omics platforms measured for multiple sample cohorts\, which are increasingly common in biomedical studies. We propose a flexible approach to the simultaneous factorization and decomposition of variation across bidimensionally linked matrices\, BIDIFAC+. This decomposes variation into a series of low-rank components that may be shared across any number of row sets (e.g.\, omics platforms) or column sets (e.g.\, sample cohorts). Our objective function extends nuclear norm penalization\, is motivated by random matrix theory\, and can be shown to give the mode of a Bayesian posterior distribution. We apply the method to pan-omics pan-cancer data from The Cancer Genome Atlas (TCGA)\, integrating data from 4 different omics platforms and 29 different cancer types. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-eric-lock-university-of-minnesota/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20200921T160000
DTEND;TZID=America/New_York:20200921T171500
DTSTAMP:20210515T195408
CREATED:20200904T130214Z
LAST-MODIFIED:20200908T155629Z
UID:5588-1600704000-1600708500@stat-or.unc.edu
SUMMARY:STOR Colloquium: Themis Sapsis\, MIT
DESCRIPTION:Output-Weighted Active Sampling for Bayesian Uncertainty Quantification and Prediction of Rare Events \nThemis Sapsis \nWe introduce a class of acquisition functions for sample selection that leads to faster convergence in applications related to Bayesian uncertainty quantification of rare events. The approach follows the paradigm of active learning\, whereby existing samples of a black-box function are utilized to optimize the next most informative sample. The proposed method aims to take advantage of the fact that some input directions of the black-box function have a larger impact on the output than others\, which is important especially for systems exhibiting rare and extreme events. The acquisition functions introduced in this work leverage the properties of the likelihood ratio\, a quantity that acts as a probabilistic sampling weight and guides the active-learning algorithm towards regions of the input space that are deemed most relevant. We demonstrate the superiority of the proposed approach in the uncertainty quantification of a hydrological system as well as the probabilistic quantification of rare events in dynamical systems and the identification of their precursors. We also discuss connections and implications for Bayesian optimization and present applications related to path planning for anomaly (rare event) detection in environment exploration. \n \nJoint work with Dr. Antoine Blanchard
URL:https://stat-or.unc.edu/event/stor-colloquium-themis-sapsis-mit/
LOCATION:Hanes Hall
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20201005T160000
DTEND;TZID=America/New_York:20201005T171500
DTSTAMP:20210515T195408
CREATED:20200817T174044Z
LAST-MODIFIED:20200925T192924Z
UID:5574-1601913600-1601918100@stat-or.unc.edu
SUMMARY:STOR Colloquium: Patrick Combettes\, NCSU
DESCRIPTION:Patrick Louis Combettes\nNorth Carolina State University \n\nPerspective Functions and Applications \nIn this talk I will discuss mathematical and computational issues pertaining to perspective functions\, a powerful concept that makes it possible to extend a convex function to a jointly convex one in terms of an additional scale variable. Applications in inverse problems and statistics will be presented.
URL:https://stat-or.unc.edu/event/stor-colloquium-patrick-combettes-ncsu/
LOCATION:Hanes Hall
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20201019T160000
DTEND;TZID=America/New_York:20201019T171500
DTSTAMP:20210515T195408
CREATED:20200904T125346Z
LAST-MODIFIED:20200925T193017Z
UID:5582-1603123200-1603127700@stat-or.unc.edu
SUMMARY:STOR Colloquium: Lihua Lei\, Stanford
DESCRIPTION:Lihua Lei\nStanford University \n\nHierarchical Community Detection for Heterogeneous and \nMulti-scaled Networks \n \nReal-world networks are often hierarchical\, heterogeneous\, and multi-scaled\, while the idealized stochastic block models that are extensively studied in the literature tend to be over-simplified. In a line of work\, we propose several top-down recursive partitioning algorithms which start with the entire network and repeatedly divide the nodes into two communities by certain spectral clustering methods\, until a stopping rule indicates no further community structure. For these algorithms\, the number of communities does not need to be known a priori or estimated consistently. On a broad class of hierarchical network models motivated by Clauset\, Moore and Newman (2008)\, in which the communities are allowed to be heterogeneous and multi-scaled in terms of the size and link probabilities\, our algorithms are proved to achieve exact recovery for sparse networks with expected node degrees logarithmic in the network size\, and are computationally more efficient than non-hierarchical spectral clustering algorithms. More interestingly\, we identify regimes where no algorithm can recover all communities simultaneously while our algorithm can still recover the mega-communities (unions of communities defined by the hierarchy) consistently without recovering the finest structure. Our theoretical results are based on my newly developed two-to-infinity eigenspace perturbation theory for binary random matrices with independent or dependent entries.
URL:https://stat-or.unc.edu/event/stor-colloquium-lihua-lei-stanford/
LOCATION:Hanes Hall
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20201102T160000
DTEND;TZID=America/New_York:20201102T171500
DTSTAMP:20210515T195408
CREATED:20200904T125458Z
LAST-MODIFIED:20201027T130943Z
UID:5584-1604332800-1604337300@stat-or.unc.edu
SUMMARY:STOR Colloquium: Jacob Bien\, USC
DESCRIPTION:Jacob Bien\nUniversity of Southern California \n\nTree-Based Aggregation of Rare Features for Prediction \n \nIt is common in modern prediction problems for many features to be counts of rarely occurring events. The challenge posed by such “rare features” has received little attention despite its prevalence in diverse areas\, ranging from biology (e.g.\, rare species within a microbiome) to natural language processing (e.g.\, rare words within an online hotel review). We show\, both theoretically and empirically\, that not explicitly accounting for the rareness of features can greatly reduce the effectiveness of an analysis. We next propose a framework for aggregating rare features into denser features in a flexible manner that creates better predictors of the response. Applications to the microbiome and to online hotel reviews show how our methodology is useful in a wide range of contexts.
URL:https://stat-or.unc.edu/event/stor-colloquium-jacob-bein-usc/
LOCATION:Hanes Hall
CATEGORIES:STOR Colloquium
END:VEVENT
END:VCALENDAR