BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Department of Statistics and Operations Research - ECPv5.1.6//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Department of Statistics and Operations Research
X-ORIGINAL-URL:https://stat-or.unc.edu
X-WR-CALDESC:Events for Department of Statistics and Operations Research
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20180311T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20181104T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20190310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20191103T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20180827T153000
DTEND;TZID=America/New_York:20180827T163000
DTSTAMP:20210515T191025
CREATED:20180827T134759Z
LAST-MODIFIED:20180827T134759Z
UID:3811-1535383800-1535387400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Richard Smith\, UNC-CH
DESCRIPTION:The Department of \nStatistics and Operations Research \nThe University of North Carolina at Chapel Hill \n \n \nSTOR Colloquium \nMonday\, August 27th\, 2018 \n120 Hanes Hall \n3:30pm\n\nRichard Smith\nUniversity of North Carolina-Chapel Hill\n\nThe First Anniversary of Hurricane Harvey \n \nThis weekend marks exactly one year since Hurricane Harvey devastated much of the Caribbean and then dumped record levels of rainfall on the city of Houston and its environs. This event also stimulated much scientific work focused on two major issues of a statistical nature\, (a) trying to assess just how extreme the Harvey rainfalls were\, and (b) assessing to what extent the extremeness of Harvey could be attributed to anthropogenic climate change\, and the related question of how common such events are likely to be in the future. In joint work with Ken Kunkel of the North Carolina Institute of Climate Studies\, I have fitted extreme value models to extreme precipitation events in the southeast US and have linked these both to increasing temperatures in the Gulf of Mexico and to global levels of carbon dioxide\, both of which have anthropogenic origins. To complement these applied topics\, I will also review more theoretical developments stimulated by similar problems. These are (i) extreme value theory for spatial processes\, and (ii) the statistical theory surrounding the detection and attribution of anthropogenic signals in both observed and model-generated climate data. \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n \n
URL:https://stat-or.unc.edu/event/stor-colloquium-richard-smith-unc-ch/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20180926T153000
DTEND;TZID=America/New_York:20180926T163000
DTSTAMP:20210515T191025
CREATED:20180827T143715Z
LAST-MODIFIED:20180912T180326Z
UID:3819-1537975800-1537979400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Tingting Zhang\, University of Virginia
DESCRIPTION:Tingting Zhang\nUniversity of Virginia \n\nA Bayesian Stochastic-Blockmodel-based Approach for Mapping Epileptic Brain Networks \n \nThe human brain is a dynamic system consisting of many consistently interacting regions. The brain regions and the influences exerted by each region over another\, called directional connectivity\, form a directional network. We study normal and abnormal directional brain networks of epileptic patients using their intracranial EEG (iEEG) data\, which are multivariate time series recordings of many small brain regions. We propose a high-dimensional state-space multivariate autoregression model (SSMAR) for iEEG data. To characterize brain networks with a commonly reported cluster structure\, we use a stochastic-block-model-motivated prior for possible network patterns in the SSMAR. We develop a Bayesian framework to estimate the proposed high-dimensional model\, examine the probabilities of nonzero directional connectivity among every pair of regions\, identify clusters of densely-connected brain regions\, and map epileptic patients’ brain networks in different seizure stages. We show through both simulation and real data analysis that the new method outperforms existing network methods by being flexible enough to characterize various high-dimensional network patterns and robust to violation of model assumptions\, low iEEG sampling frequency\, and data noise. Applying the developed SSMAR and Bayesian approach to an epileptic patient’s iEEG data\, we reveal the patient’s network changes at the seizure onset and the unique connectivity of the seizure onset zone (SOZ)\, where seizures start and spread to other normal regions. Using this network result\, our method has the potential to help clinicians localize the SOZ\, a long-standing research focus in epilepsy diagnosis and treatment. \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-tingting-zhang-university-of-virginia/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181001T153000
DTEND;TZID=America/New_York:20181001T163000
DTSTAMP:20210515T191025
CREATED:20180827T143807Z
LAST-MODIFIED:20180918T172158Z
UID:3821-1538407800-1538411400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Vinayak Deshpande\, UNC Kenan Flagler
DESCRIPTION:Vinayak Deshpande \nKenan Flagler Business School\, \nUniversity of North Carolina at Chapel Hill \n \nData Driven Research: Understanding and Improving Airline Flight Schedules using BTS data \n \nThe last decade has seen an explosion of operational data that is now available to researchers. In this talk\, I will share my experience in conducting research with large datasets made publicly available by the Bureau of Transportation Statistics (BTS). These data sets include flight schedule data\, FAA operations and performance data\, DOT’s domestic airline fares consumer report\, and the T-100 domestic market data. These data sets provide granular information on airline operations in the United States. My prior research has used this data to model the intrinsic uncertainty in the travel time of any commercially scheduled domestic flight in the United States\, as well as to build models of how this uncertainty propagates in airline networks. These data driven models can be used to understand airline flight schedules through a descriptive lens\, as well as in building prescriptive models for improving airline flight schedules. I will summarize my findings on airline flight schedules in this talk\, as well as discuss potential research opportunities that can use this data. \n Links to relevant papers: \nReliable Air Travel Infrastructure\n \nAirline Flight Delays \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-vinayak-deshpande-unc-kenan-flagler/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181008T153000
DTEND;TZID=America/New_York:20181008T163000
DTSTAMP:20210515T191025
CREATED:20180827T143857Z
LAST-MODIFIED:20180912T181939Z
UID:3823-1539012600-1539016200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Rong Ge\, Duke University
DESCRIPTION:Rong Ge\nDuke University \n \nOptimization Landscape for Matrix Completion \nMatrix completion is a popular approach for recommendation systems. In theory\, it can be solved using complicated convex relaxations\, while in practice even simple algorithms such as stochastic gradient descent can always converge to the optimal solution. In this talk we will see some new results on the optimization landscape for the natural non-convex objective of matrix completion. In particular\, we will show that although the natural objective is non-convex and has many saddle points\, all of its local minima are equivalent to the global optimal solution. We will also discuss why such properties allow simple algorithms such as stochastic gradient descent to converge efficiently from an arbitrary initial point. \n \n \n \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-rong-ge-duke/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181015T153000
DTEND;TZID=America/New_York:20181015T163000
DTSTAMP:20210515T191025
CREATED:20180827T143945Z
LAST-MODIFIED:20181010T134944Z
UID:3825-1539617400-1539621000@stat-or.unc.edu
SUMMARY:STOR Colloquium: Cynthia Rudin\, Duke
DESCRIPTION:Cynthia Rudin \nDuke University \n \nNew Algorithms for Interpretable Machine Learning in High Stakes Decisions \n \nWith widespread use of machine learning\, there have been serious societal consequences from using black box models for high-stakes decisions\, including flawed models for medical imaging\, and poor bail and parole decisions in criminal justice. Explanations for black box models are not reliable\, and can be misleading. If we use interpretable models\, they come with their own explanations\, which are faithful to what the model actually computes. I will present work on (i) optimal decision lists\, (ii) interpretable neural networks for computer vision\, and (iii) optimal scoring systems (sparse linear models with integer coefficients). In our applications\, we have always been able to achieve interpretable models with the same accuracy as black box models. \n \nBio: Cynthia Rudin is an associate professor of computer science\, electrical and computer engineering\, and statistics at Duke University\, and directs the Prediction Analysis Lab. Her interests are in machine learning\, data mining\, applied statistics\, and knowledge discovery (Big Data)\, particularly interpretable machine learning. Her application areas are in energy grid reliability\, healthcare\, and computational criminology. Previously\, Prof. Rudin held positions at MIT\, Columbia\, and NYU. She holds an undergraduate degree from the University at Buffalo\, and a PhD in applied and computational mathematics from Princeton University. She is the recipient of the 2013 and 2016 INFORMS Innovative Applications in Analytics Awards\, an NSF CAREER award\, was named as one of the “Top 40 Under 40” by Poets and Quants in 2015\, and was named by Businessinsider.com as one of the 12 most impressive professors at MIT in 2015. Work from her lab has won 10 best paper awards in the last 5 years. \nShe is past chair of the INFORMS Data Mining Section\, and is currently chair of the Statistical Learning and Data Science section of the American Statistical Association. She also serves on (or has served on) committees for DARPA\, the National Institute of Justice\, the National Academy of Sciences (for both statistics and criminology/law)\, and AAAI. \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-cynthia-rudin-duke/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181022T153000
DTEND;TZID=America/New_York:20181022T163000
DTSTAMP:20210515T191025
CREATED:20180906T141601Z
LAST-MODIFIED:20180906T141601Z
UID:3846-1540222200-1540225800@stat-or.unc.edu
SUMMARY:STOR Colloquium: Robert Lund\, Clemson University
DESCRIPTION:STOR Colloquium \n\nRobert Lund\nDepartment of Mathematical Sciences \nClemson University \n\nMultiple Breakpoint Detection: Mixing Documented and Undocumented Changepoints \nThis talk presents methods to estimate the number of changepoint time(s) and their locations in time-ordered data sequences when prior information is known about some of the changepoint times. A Bayesian version of a penalized likelihood objective function is developed from minimum description length (MDL) information theory principles. Optimizing the objective function yields estimates of the changepoint number(s) and location time(s). Our MDL penalty depends on where the changepoint(s) lie\, and not solely on the total number of changepoints (as classical AIC and BIC penalties do). Specifically\, configurations with changepoints that occur relatively close to one another are penalized more heavily than sparsely arranged changepoints. The techniques allow for autocorrelation in the observations and mean shifts at each changepoint time. This scenario arises in climate time series where a “metadata” record exists documenting some\, but not necessarily all\, of station move times and instrumentation changes. Applications to climate time series are presented throughout. \n \n Refreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-robert-lund-clemson-university/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181105T143000
DTEND;TZID=America/New_York:20181105T153000
DTSTAMP:20210515T191025
CREATED:20181025T122345Z
LAST-MODIFIED:20181025T122345Z
UID:3894-1541428200-1541431800@stat-or.unc.edu
SUMMARY:STOR Colloquium: Sven Leyffer\, Argonne National Laboratory
DESCRIPTION:Mixed-Integer PDE-Constrained Optimization \n \nMany complex applications can be formulated as optimization problems constrained by partial differential equations (PDEs) with integer decision variables. This new class of problems\, called mixed-integer PDE-constrained optimization (MIPDECO)\, must overcome the combinatorial challenge of integer decision variables combined with the numerical and computational complexity of PDE-constrained optimization. Examples of MIPDECOs include the remediation of contaminated sites and the maximization of oil recovery; the design of next-generation solar cells; the layout design of wind-farms; the design and control of gas networks; disaster recovery; and topology optimization. \n \nWe will present some emerging applications of mixed-integer PDE-constrained optimization\, review existing approaches to solve these problems\, and \nhighlight their computational and mathematical challenges. We show how existing methods for solving mixed-integer optimization problems can be adapted to solve this new class of problems.
URL:https://stat-or.unc.edu/event/stor-colloquium-sven-leyffer-argonne-national-laboratory/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181107T143000
DTEND;TZID=America/New_York:20181107T153000
DTSTAMP:20210515T191025
CREATED:20181010T134459Z
LAST-MODIFIED:20181010T134459Z
UID:3876-1541601000-1541604600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Quefeng Li\, UNC-Chapel Hill
DESCRIPTION:Quefeng Li \nDepartment of Biostatistics\nUNC-Chapel Hill \n \n \nIntegrative linear discriminant analysis with guaranteed error rate improvement \n \nNumerous empirical studies have found that integrative analysis of multimodal data can result in better statistical performance. However\, little theory is known on when and why including more variables in a statistical model can improve the prediction. In the context of two-class classification\, we provide a theoretical guarantee that running an integrative linear discriminant analysis on multimodal data achieves smaller misclassification error than running linear discriminant analysis on each individual data type. We explicitly characterize the trade-off between the extra information brought by multimodal data and the extra estimation error they bring. We further demonstrate that such a guarantee also applies to some other classifiers. \n \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-quefeng-li-unc-chapel-hill/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181112T143000
DTEND;TZID=America/New_York:20181112T153000
DTSTAMP:20210515T191025
CREATED:20180827T144038Z
LAST-MODIFIED:20181010T134552Z
UID:3827-1542033000-1542036600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Xi Chen\, NYU
DESCRIPTION:Xi Chen\nNew York University \n \nStatistical Inference for Model Parameters with Stochastic Gradient Descent \n \nIn this talk\, we investigate the problem of statistical inference of the true model parameters based on stochastic gradient descent (SGD) with Ruppert-Polyak averaging. To this end\, we propose a consistent estimator of the asymptotic covariance of the average iterate from SGD\, the batch-means estimator\, which only uses the iterates from SGD. As the SGD process forms a time-inhomogeneous Markov chain\, our batch-means estimator with carefully chosen increasing batch sizes generalizes the classical batch-means estimator designed for time-homogeneous Markov chains. The proposed batch-means estimator allows us to construct asymptotically exact confidence intervals and hypothesis tests. We further discuss an extension to conducting inference based on SGD for high-dimensional linear regression. \n \nBio: Xi Chen is an assistant professor at Stern School of Business at New York University. Before that\, he was a Postdoc in the group of Prof. Michael Jordan at UC Berkeley. He obtained his Ph.D. from the Machine Learning Department at Carnegie Mellon University. He studies high-dimensional statistics\, multi-armed bandits\, and stochastic optimization. He received the Simons-Berkeley Research Fellowship\, Google Faculty Award\, Adobe Data Science Award\, and Bloomberg research award\, and was featured in the 2017 Forbes list of “30 Under 30 in Science”. \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-xi-chen-nyu/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181119T143000
DTEND;TZID=America/New_York:20181119T153000
DTSTAMP:20210515T191025
CREATED:20180827T144239Z
LAST-MODIFIED:20181116T201525Z
UID:3829-1542637800-1542641400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Tailen Hsing\, UMich
DESCRIPTION:Tailen Hsing\nDepartment of Statistics \nUniversity of Michigan \n \nModeling and inference of local stationarity \n \nStationarity is a common assumption in spatial statistics. The justification is often that stationarity is a reasonable approximation to the true state of dependence if we focus on spatial data locally. In this talk\, we first review various known approaches for modeling nonstationary spatial data. We then examine the notion of local stationarity in more detail. To illustrate\, we focus on the multi-fractional Brownian motion\, for which a thorough analysis could be conducted assuming data are observed on a regular grid. A theoretical lower bound for the minimax risk of this inference problem is established for a wide class of smooth Hurst functions. We also propose a new nonparametric estimator and show that it is rate optimal. Implementation issues of the estimator including how to overcome the presence of a nuisance parameter and choose the tuning parameter from data will be considered. Finally\, extensions to more general settings that relate to Matheron’s intrinsic random functions will be briefly discussed. \n \n \n \n \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-tailen-hsing-umich/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181128T143000
DTEND;TZID=America/New_York:20181128T153000
DTSTAMP:20210515T191025
CREATED:20180827T144351Z
LAST-MODIFIED:20181119T170729Z
UID:3831-1543415400-1543419000@stat-or.unc.edu
SUMMARY:STOR Colloquium: Weijie Su\, UPenn
DESCRIPTION:STOR Colloquium \nWednesday\, November 28th\, 2018 \n120 Hanes Hall \n3:30pm\n\nWeijie Su\nUniversity of Pennsylvania \nUncertainty Quantification for Stochastic Gradient Descent \n \nStochastic gradient descent (SGD) is an immensely popular approach for online learning in settings where data arrives in a stream or data sizes are very large. However\, despite an ever-increasing volume of work on SGD\, much less is known about the statistical inferential properties of SGD-based predictions. Taking a fully inferential viewpoint\, this talk introduces a novel procedure termed HiGrad to conduct statistical inference for online learning\, without incurring additional computational cost compared with SGD. The HiGrad procedure begins by performing SGD updates for a while and then splits the single thread into several threads\, and this procedure hierarchically operates in this fashion along each thread. With predictions provided by multiple threads in place\, a t-based confidence interval is constructed by decorrelating predictions using covariance structures given by a Donsker-style extension of the Ruppert–Polyak averaging scheme\, which is a technical contribution of independent interest. Under certain regularity conditions\, the HiGrad confidence interval is shown to attain asymptotically exact coverage probability. The performance of HiGrad is evaluated through extensive simulation studies and a real data example. We conclude the talk with an application of HiGrad to deep neural networks. \nThis is based on joint work with Yuancheng Zhu. \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-weijie-su-upenn/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20181203T143000
DTEND;TZID=America/New_York:20181203T153000
DTSTAMP:20210515T191025
CREATED:20181119T171709Z
LAST-MODIFIED:20181119T171709Z
UID:3916-1543847400-1543851000@stat-or.unc.edu
SUMMARY:STOR Colloquium: Ana-Maria Staicu\, NC State
DESCRIPTION:STOR Colloquium \nMonday\, December 3rd\, 2018 \n120 Hanes Hall \n3:30pm\n\nAna-Maria Staicu\nNorth Carolina State University \nLongitudinal Dynamic Functional Regression \n \nIn this talk we discuss regression models to study the association between scalar outcomes and functional predictors observed over time\, at many instances\, in longitudinal studies. We propose a parsimonious modeling framework to study time-varying regression that leads to superior prediction properties and allows reconstruction of full trajectories of the response. The idea is to model the time-varying functional predictors using orthogonal basis functions and expand the time-varying regression coefficient using the same basis. Numerical investigation through simulation studies and data analysis shows excellent performance in terms of accurate prediction and efficient computations\, when compared with existing alternatives. The methods are inspired by and applied to an animal science application\, where the interest is in studying the association between the feed intake of lactating sows and the minute-by-minute temperature throughout the 21 days of their lactation period. \n \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-ana-maria-staicu-nc-state/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190111T143000
DTEND;TZID=America/New_York:20190111T153000
DTSTAMP:20210515T191025
CREATED:20190102T184849Z
LAST-MODIFIED:20190102T220845Z
UID:3951-1547217000-1547220600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Bikram Karmakar\, University of Pennsylvania
DESCRIPTION:Bikram Karmakar \nStatistics Department\, The Wharton School\,\nUniversity of Pennsylvania \n \nEvidence factors for observational studies: methodology\, computation and applications. \n \nObservational studies aim to elucidate cause-and-effect relationships from situations in which treatment is not randomly assigned. A sensitivity analysis for an observational study assesses how much bias\, due to non-random assignment of treatment\, would be necessary to change the conclusions of an analysis that assumes treatment assignment was effectively random. Causal conclusions gain strength from a demonstration that they are insensitive to small or moderate violations of random treatment assignment\, especially if that happens in each of several statistically independent analyses that depend upon very different assumptions. In particular\, causal conclusions gain strength when evidence factors concur and are insensitive to bias\, where a study is said to contain two or more evidence factors if it provides two or more tests of the null hypothesis of no treatment effect that would be (essentially) independent were there no effect. Previous work with evidence factors has not addressed the problem that they involve multiple testing and how to control the type-I error to obtain valid inference. We develop a powerful method for controlling the familywise error rate for sensitivity analyses with evidence factors. We show that the Bahadur efficiency of sensitivity analysis for the combined evidence is greater than for any one evidence factor alone\, so that even though using two or more evidence factors requires multiple testing\, a study is better off asymptotically using two or more evidence factors than just one factor. \n \nWe also develop methods to widen the applicability of evidence factors to various designs for causal assessment\, including designs with instrumental variables\, and case-control studies. \nComputationally\, however\, it is often very hard to build these designs optimally. Even the simplest addition to a one treatment-control comparison – a second comparison – creates design problems without polynomial-time solutions. We develop an “approximation algorithm” that provides a solution in polynomial time that is provably not much worse than the unattainable optimal solution. \n \nWe illustrate our methodological and computational developments for evidence factors in two observational studies: (i) the effect of exposure to radiation on solid cancers and (ii) the effect of having a side airbag on the chance of dying in a car crash. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-bikram-karmakar/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190114T143000
DTEND;TZID=America/New_York:20190114T153000
DTSTAMP:20210515T191025
CREATED:20190102T185216Z
LAST-MODIFIED:20190102T223233Z
UID:3959-1547476200-1547479800@stat-or.unc.edu
SUMMARY:STOR Colloquium: Joshua Cape\, Johns Hopkins
DESCRIPTION:Joshua Cape\nJohns Hopkins University \n \nStatistical analysis and spectral methods for signal-plus-noise matrix models \n \nEstimating eigenvectors and principal subspaces is of fundamental importance for numerous problems in statistics\, data science\, and network analysis\, including covariance matrix estimation\, principal component analysis\, and community detection. For each of these problems\, we obtain foundational results that precisely quantify the local (e.g.\, entrywise) behavior of sample eigenvectors within the context of a unified signal-plus-noise matrix framework. Our methods and results collectively address eigenvector consistency and asymptotic normality\, decompositions of high-dimensional matrices\, Procrustes analysis\, deterministic perturbation bounds\, and real-data spectral clustering applications in connectomics. \n \n \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-joshua-cape/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190116T143000
DTEND;TZID=America/New_York:20190116T153000
DTSTAMP:20210515T191025
CREATED:20190102T185436Z
LAST-MODIFIED:20190102T223102Z
UID:3965-1547649000-1547652600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Yanglei Song\, University of Illinois
DESCRIPTION:Yanglei Song\nUniversity of Illinois at Urbana-Champaign \n Asymptotically optimal multiple testing with streaming data \n \nThe problem of testing multiple hypotheses with streaming (sequential) data arises in diverse applications such as multi-channel signal processing\, surveillance systems\, multi-endpoint clinical trials\, and online surveys. In this talk\, we investigate the problem under two generalized error metrics. Under the first one\, the probability of at least k mistakes\, of any kind\, is controlled. Under the second\, the probabilities of at least k1 false positives and at least k2 false negatives are simultaneously controlled. For each formulation\, we characterize the optimal expected sample size to a first-order asymptotic approximation as the error probabilities vanish\, and propose a novel procedure that is asymptotically efficient under every signal configuration. These results are established when the data streams for the various hypotheses are independent and each local log-likelihood ratio statistic satisfies a certain law of large numbers. Further\, in the special case of iid observations\, we quantify the asymptotic gains of sequential sampling over fixed-sample size schemes. \n \n \n \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-yanglei-song/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190118T143000
DTEND;TZID=America/New_York:20190118T153000
DTSTAMP:20210515T191025
CREATED:20190102T185356Z
LAST-MODIFIED:20190108T194040Z
UID:3963-1547821800-1547825400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Vince Lyzinski\, Univ. of Massachusetts Amherst
DESCRIPTION:Vince Lyzinski\nThe University of Massachusetts Amherst \n \nGraph matching in edge-independent networks \n \nThe graph matching problem seeks to find an alignment between the vertex sets of two graphs that best preserves common structure across graphs. Here\, we consider the closely related problem of graph matchability: Given a latent alignment between the vertex sets of two graphs\, under what conditions will the solution to the graph matching optimization problem recover this alignment in the presence of shuffled vertex labels? We consider the problem of graph matchability in non-identically distributed networks\, and working in a general class of edge-independent network models\, we demonstrate that graph matchability is almost surely lost when matching the networks directly\, and is almost perfectly recovered when first centering the networks using Universal Singular Value Thresholding before matching. While there are currently no efficient algorithms for solving the graph matching problem in general\, these results nonetheless provide practical algorithmic guidance for approximately matching networks in both real and synthetic data applications. \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-vince-lyzinski/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190123T143000
DTEND;TZID=America/New_York:20190123T153000
DTSTAMP:20210515T191025
CREATED:20190102T185137Z
LAST-MODIFIED:20190116T144101Z
UID:3957-1548253800-1548257400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Jingshen Wang\, University of Michigan
DESCRIPTION:Jingshen Wang\n University of Michigan \n \nInference on Treatment Effects after Model Selection \n \nInferring cause-effect relationships between variables is of primary importance in many sciences. In this talk\, I will discuss two approaches for making valid inference on treatment effects when a large number of covariates are present. The first approach is to perform model selection and then to deliver inference based on the selected model. If the inference is made ignoring the randomness of the model selection process\, then there could be severe biases in estimating the parameters of interest. While the estimation bias in an under-fitted model is well understood\, I will address a lesser known bias that arises from an over-fitted model. The over-fitting bias can be eliminated through data splitting at the cost of statistical efficiency\, and I will propose a repeated data splitting approach to mitigate the efficiency loss. The second approach concerns the existing methods for debiased inference. I will show that the debiasing approach is an extension of OLS to high dimensions\, and that a careful bias analysis leads to an improvement to further control the bias. The comparison between these two approaches provides insights into their intrinsic bias-variance trade-off\, and I will show that the debiasing approach may lose efficiency in observational studies. \n \nThis is joint work with Xuming He and Gongjun Xu. \n \n \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-jingshen-wang/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190125T143000
DTEND;TZID=America/New_York:20190125T153000
DTSTAMP:20210515T191025
CREATED:20190102T184949Z
LAST-MODIFIED:20190117T205204Z
UID:3953-1548426600-1548430200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Geoffrey Schiebinger\, MIT and Harvard
DESCRIPTION:Geoffrey Schiebinger \nThe Broad Institute of MIT and Harvard and \nthe MIT Statistics and Data Science Center \n \nTowards a mathematical theory of development \n \nIn this talk we introduce a mathematical model to describe temporal processes like embryonic development and cellular reprogramming. We consider stochastic processes in gene expression space to represent developing populations of cells\, and we use optimal transport to recover the temporal couplings of the process. We apply these ideas to study 315\,000 single-cell RNA-sequencing profiles collected at 40 time points over 18 days of reprogramming fibroblasts into induced pluripotent stem cells. To validate the optimal transport model\, we demonstrate that it can accurately predict developmental states at held-out time points. We construct a high-resolution map of reprogramming that rediscovers known features; uncovers new alternative cell fates including neural- and placental-like cells; predicts the origin and fate of any cell class; and implicates regulatory models in particular trajectories. Of these findings\, we highlight the transcription factor Obox6 and the paracrine signaling factor GDF9\, which we experimentally show enhance reprogramming efficiency. Our approach provides a general framework for investigating cellular differentiation\, and poses some interesting questions in theoretical statistics. \n \nBio: Geoffrey Schiebinger is a postdoctoral fellow in the MIT Center for Statistics and the Klarman Cell Observatory at the Broad Institute of MIT and Harvard. His postdoctoral mentors are Eric Lander\, Aviv Regev\, and Philippe Rigollet. Geoffrey has won numerous academic awards including a Career Award at the Scientific Interface from the Burroughs Wellcome Fund\, and a prize for Best Contribution to the conference Statistical Challenges in Single Cell Analysis organized by ETH Zurich. Before coming to MIT\, Geoff studied Statistics at UC Berkeley\, where he earned his Ph.D. 
in May 2016 for a doctoral thesis on the Mathematics of Precision Measurement. He is fortunate to have been advised by Benjamin Recht\, and to have also worked with Martin Wainwright\, Bin Yu\, and Aditya Guntuboyina. Geoffrey attended Stanford University from 2007 to 2011 for his undergraduate degree in mathematics with a minor in physics and a master’s degree in electrical engineering. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-geoffrey-schiebinger/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190128T143000
DTEND;TZID=America/New_York:20190128T153000
DTSTAMP:20210515T191025
CREATED:20190102T185044Z
LAST-MODIFIED:20190116T144345Z
UID:3955-1548685800-1548689400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Jeffrey Regier\, Univ. of California\, Berkeley
DESCRIPTION:Jeffrey Regier\nUniversity of California\, Berkeley \n\nStatistical Inference for Cataloging the Visible Universe \n \nA key task in astronomy is to locate astronomical objects in images and to characterize them according to physical parameters such as brightness\, color\, and morphology. This task\, known as cataloging\, is challenging for several reasons: many astronomical objects are much dimmer than the sky background\, labeled data is generally unavailable\, overlapping astronomical objects must be resolved collectively\, and the datasets are enormous (terabytes now\, petabytes soon). Existing approaches to cataloging are largely based on algorithmic software pipelines that lack an explicit inferential basis. In this talk\, I present a new approach to cataloging based on inference in a fully specified probabilistic model. I consider two inference procedures: one based on variational inference (VI) and another based on MCMC. A distributed implementation of VI\, written in Julia and run on a supercomputer\, achieves petascale performance\, a first for any high-productivity programming language. The run is the largest-scale application of Bayesian inference reported to date. In an extension\, using new ideas from variational autoencoders and deep learning\, I avoid many of the traditional disadvantages of VI relative to MCMC\, and improve model fit. \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-jeffrey-regier/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190130T143000
DTEND;TZID=America/New_York:20190130T153000
DTSTAMP:20210515T191025
CREATED:20190102T185259Z
LAST-MODIFIED:20190116T144153Z
UID:3961-1548858600-1548862200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Pragya Sur\, Stanford University
DESCRIPTION:Pragya Sur\n Stanford University \n \nA modern maximum-likelihood approach for \nhigh-dimensional logistic regression \n \nLogistic regression is arguably the most widely used and studied non-linear model in statistics. Classical maximum-likelihood theory based statistical inference is ubiquitous in this context. This theory hinges on well-known fundamental results: (1) the maximum-likelihood-estimate (MLE) is asymptotically unbiased and normally distributed\, (2) its variability can be quantified via the inverse Fisher information\, and (3) the likelihood-ratio-test (LRT) is asymptotically distributed as a Chi-Square. In this talk\, I will show that in the common modern setting where the number of features and the sample size are both large and comparable\, classical results are far from accurate. In fact\, (1) the MLE is biased\, (2) its variability is far greater than classical results predict\, and (3) the LRT is not distributed as a Chi-Square. Consequently\, p-values obtained based on classical theory are completely invalid in high dimensions. In turn\, I will propose a new theory that characterizes the asymptotic behavior of both the MLE and the LRT under some assumptions on the covariate distribution\, in a high-dimensional setting. Empirical evidence demonstrates that this asymptotic theory provides accurate inference in finite samples. Practical implementation of these results necessitates the estimation of a single scalar\, the overall signal strength\, and I will propose a procedure for estimating this parameter precisely. \nThis is based on joint work with Emmanuel Candes and Yuxin Chen. \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-pragya-sur/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190215T143000
DTEND;TZID=America/New_York:20190215T153000
DTSTAMP:20210515T191025
CREATED:20190201T212611Z
LAST-MODIFIED:20190208T164533Z
UID:3998-1550241000-1550244600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Elina Robeva\, MIT
DESCRIPTION:Elina Robeva\nMassachusetts Institute of Technology \n\nMaximum likelihood estimation under total positivity \n \nNonparametric density estimation is a challenging statistical problem — in general the maximum likelihood estimate (MLE) does not even exist! Introducing shape constraints allows a path forward. In this talk I will discuss non-parametric density estimation under total positivity (i.e. log-supermodularity). Though they possess very special structure\, totally positive random variables are quite common in real world data and exhibit appealing mathematical properties. Given i.i.d. samples from a totally positive and log-concave distribution\, we prove that the MLE exists with probability one if there are at least 3 samples. We characterize the domain of the MLE\, and give algorithms to compute it. If the observations are 2-dimensional or binary\, we show that the logarithm of the MLE is a piecewise linear function and can be computed via a certain convex program. Finally\, I will discuss statistical guarantees for the convergence of the MLE\, and will conclude with a variety of further research directions. \n
URL:https://stat-or.unc.edu/event/graduate-student-seminar-haipeng-gao/
LOCATION:Hanes 120
CATEGORIES:Graduate Seminar,STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190220T143000
DTEND;TZID=America/New_York:20190220T153000
DTSTAMP:20210515T191025
CREATED:20190218T171003Z
LAST-MODIFIED:20190218T171003Z
UID:4021-1550673000-1550676600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Abishek Sankararaman\, UT-Austin
DESCRIPTION:Abishek Sankararaman \nThe University of Texas at Austin \n \nInterference Queuing Networks and \nSpatial Birth-Death Processes \n \nMotivated by applications in wireless networks\, we consider a class of spatial birth-death processes\, both on the continuum and on the discrete space of grids. In the continuum\, we consider an interacting particle birth-death process on a compact torus and study questions of stochastic stability. In the discrete setting\, we consider a countably infinite collection of coupled queues representing a large wireless network in the Euclidean space. There is one queue at each point of the d-dimensional integer grid. These queues have independent Poisson arrivals\, but are coupled through their service rates. The service discipline is of the processor sharing type\, with the service rate in each queue slowed down when the neighboring queues have a larger workload. More precisely\, the service rate is the signal to interference ratio of wireless network theory. This coupling is parameterized by a symmetric and translation invariant interference sequence. The dynamics is infinite dimensional Markov\, with each queue having a non-compact state space. It is neither reversible nor asymptotically product form\, as in the mean-field setting. Coupling and percolation techniques are first used to show that this dynamics has well defined trajectories. Coupling from the past techniques of the Loynes’ type are then proposed to build its minimal stationary regime. This regime is the one obtained when starting from the all empty initial condition in the distant past. The rate conservation principle of Palm calculus is then used to identify the stability condition of this system\, namely the condition on the interference sequence and arrival rates guaranteeing the finiteness of this minimal regime. We show that the identified condition is also necessary in certain special cases and conjecture it to be necessary in all cases. 
Remarkably\, the rate conservation principle also provides a closed form expression for its mean queue size. When the stability condition holds\, this minimal solution is the unique stationary regime\, provided it has finite second moments\, and this is the case if the arrival rate is small enough. In addition\, there exists a range of small initial conditions for which the dynamics is attracted to the minimal regime. Surprisingly\, however\, there exists another range of larger though finite initial conditions for which the dynamics diverges\, even though the stability criterion holds. \n \nBased on the papers https://arxiv.org/pdf/1604.07884.pdf and https://arxiv.org/pdf/1710.09797.pdf.
URL:https://stat-or.unc.edu/event/stor-colloquium-abishek-sankararaman-ut-austin/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190301T143000
DTEND;TZID=America/New_York:20190301T153000
DTSTAMP:20210515T191025
CREATED:20190226T161801Z
LAST-MODIFIED:20190226T161801Z
UID:4026-1551450600-1551454200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Tucker McElroy\, US Census Bureau
DESCRIPTION:Tucker McElroy\nU.S. Census Bureau \nCasting Vector Time Series: Forecasting\, Imputation\, and Signal Extraction In the Context of Big Data \n \nRecursive algorithms\, based upon the nested structure of Toeplitz covariance matrices arising from stationary processes\, are presented for the efficient computation of multi-step ahead forecast error covariances for nonstationary vector time series. Further\, we discuss time reversal to forecast the past\, and algorithms for imputation of missing values. These quantities are required to quantify multi-step ahead forecast error and signal extraction error. The methods are applied to daily retail data exhibiting trend dynamics and seasonality. \n \nShort bio: Dr. McElroy is Senior Time Series Mathematical Statistician at the U.S. Census Bureau\, where he has served the last 15 years as a researcher and consultant on seasonal adjustment problems. He has a B.A. from Columbia University (1996)\, and a Ph.D. in mathematics from University of California\, San Diego (2001).
URL:https://stat-or.unc.edu/event/stor-colloquium-tucker-mcelroy-us-census-bureau/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190304T143000
DTEND;TZID=America/New_York:20190304T153000
DTSTAMP:20210515T191025
CREATED:20190226T172921Z
LAST-MODIFIED:20190226T172921Z
UID:4028-1551709800-1551713400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Jonathan M. Lees\, UNC-Chapel Hill
DESCRIPTION:Jonathan M. Lees \nUniversity of North Carolina at Chapel Hill \n \nGeophysical Time Series Analysis on Volcanoes: \nCan we quantify non-linearity? \n\nMost geophysical processes are aperiodic\, noisy\, intermittent and transient. This requires specialized methods for time series analysis that seek patterns in time series that vary in space and time. I present here examples from research on exploding volcanoes that exhibit tremor that appears to be resonant but likely results from nonlinear feedback systems. The physical models for these observations are highly controversial\, so detailed analysis of patterns observed in the time series represents a significant challenge for understanding volcano dynamics. Seismic and acoustic wave observations will be presented from a number of volcanoes\, including Karymsky (Kamchatka\, Russia)\, Santiaguito (Guatemala)\, Tungurahua (Ecuador) and others. Methods involving time-domain and frequency-domain analysis\, wavelet transforms\, Hilbert-Huang empirical mode approaches and bispectral decomposition will be illustrated.
URL:https://stat-or.unc.edu/event/stor-colloquium-jonathan-m-lees-unc-chapel-hill/
LOCATION:Hanes Hall
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190320T153000
DTEND;TZID=America/New_York:20190320T163000
DTSTAMP:20210515T191025
CREATED:20190220T221249Z
LAST-MODIFIED:20190220T221249Z
UID:4023-1553095800-1553099400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Yuan Liao\, Rutgers University
DESCRIPTION:Yuan Liao \nRutgers University \n \n \nFactor-Driven Two-Regime Regression using \nMixed Integer Programming \n \nWe propose a two-regime regression model where the switching between the regimes is driven by a vector of possibly unobservable factors. When the factors are latent\, we estimate them by the principal component analysis of a much larger data set. We show that the optimization problem can be reformulated as mixed integer optimization and present two alternative computational algorithms: (1) MI quadratic programming and (2) MI linear programming. We show that (1) is numerically equivalent to the original least squares problem\, but runs slowly. On the other hand\, (2) runs much faster\, and produces asymptotically equivalent estimators. We derive the asymptotic distributions of the resulting estimators\, and establish a phase transition that describes the effect of first stage factor estimation. \n \nThe paper can be downloaded from https://arxiv.org/abs/1810.11109
URL:https://stat-or.unc.edu/event/stor-colloquium-yuan-liao-rutgers-university/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190401T153000
DTEND;TZID=America/New_York:20190401T163000
DTSTAMP:20210515T191025
CREATED:20190312T134500Z
LAST-MODIFIED:20190312T192835Z
UID:4066-1554132600-1554136200@stat-or.unc.edu
SUMMARY:STOR Colloquium: Serhan Ziya\, UNC-Chapel Hill
DESCRIPTION:Serhan Ziya\nUniversity of North Carolina at Chapel Hill \n \nService operations with a focus in healthcare: a partial and \nsubjective overview of related research in STOR \n \nThis talk will provide an overview of some of the research projects in which the speaker is either currently an active participant or hopes to help initiate in the near future. The main goal is to create awareness and generate interest in the area of service operations\, particularly as it relates to healthcare\, and to highlight opportunities for potential collaborations with different research groups within or outside STOR. The first half of the talk will be devoted to a relatively detailed presentation of an interdisciplinary research project on improving patient flow in emergency departments\, while the second half will consist of a quick exposition of several other projects\, which have features that are likely to be of interest to researchers with various methodological backgrounds. \n \nCollaborators on the patient flow project are Nilay Tanik Argon\, Tommy Bohrmann\, Wanyi Chen\, Benjamin Linthicum\, Kenny Lopiano\, Abhi Mehrotra\, and Debbie Travers. Collaborators on other projects will be introduced during the talk.
URL:https://stat-or.unc.edu/event/stor-colloquium-serhan-ziya-unc-chapel-hill/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190923T153000
DTEND;TZID=America/New_York:20190923T163000
DTSTAMP:20210515T191025
CREATED:20190830T192333Z
LAST-MODIFIED:20190830T192333Z
UID:4356-1569252600-1569256200@stat-or.unc.edu
SUMMARY:Colloquium: Peter J. Mucha\, UNC-Chapel Hill
DESCRIPTION:Peter J. Mucha \nThe University of North Carolina at Chapel Hill \nDepartment of Mathematics \nCommunities in Multilayer Networks \nCommunity detection describes the organization of a network in terms of patterns of connection\, identifying tightly connected structures known as communities. A wide variety of methods for community detection have been proposed\, with a number of software packages available for performing community detection. In the past decade\, there has been increased interest in multilayer networks\, a general framework that can be used to describe networks with multiple types of relationships\, that change in time\, or that network together multiple kinds of networks. We describe various generalizations of community detection to multilayer networks\, including results about detectability limits and a new post-processing procedure to explore the parameter space of multilayer modularity\, along with pointers to using community detection in applications.
URL:https://stat-or.unc.edu/event/colloquium-peter-j-mucha-unc-chapel-hill/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20191004T153000
DTEND;TZID=America/New_York:20191004T163000
DTSTAMP:20210515T191025
CREATED:20190926T175045Z
LAST-MODIFIED:20190930T151845Z
UID:4418-1570203000-1570206600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Tong Wang\, University of Iowa
DESCRIPTION:Dr. Tong Wang \nTippie College of Business \nUniversity of Iowa \n \nHybrid Predictive Model: When an Interpretable Model Collaborates with a Black-box Model \n \nInterpretable machine learning has received increasing interest in recent years\, especially in domains where humans are involved in the decision-making process. However\, the possible loss of the task performance for gaining interpretability is often inevitable\, especially for large datasets or complicated tasks. This performance downgrade puts practitioners in a dilemma of choosing between a top-performing black-box model with no explanations and an interpretable model with unsatisfying task performance. In this work\, we propose a novel framework for building a Hybrid Predictive Model (HPM) that integrates an interpretable model with any black-box model to introduce interpretability in the decision-making process at no or low cost of the predictive accuracy. The interpretable model substitutes the black-box model on a subset of data where the black-box model is overkill or nearly overkill. We design a principled objective function that considers predictive accuracy\, model interpretability\, and model transparency\, which is the percentage of data processed by the interpretable model. This framework brings together the advantages of the high predictive performance of black-box models and the high interpretability of interpretable models. We instantiate the proposed framework with two types of models\, one using decision rules as the interpretable collaborator and one using linear models. For both models\, we develop customized training algorithms with theoretically grounded bounds to reduce computation. We test the hybrid predictive models on structured datasets and text data. In these experiments\, the interpretable models collaborate with state-of-the-art black-box models including ensemble models and neural networks. 
We propose to use efficient frontiers to characterize the trade-off between transparency and predictive performance. Results show that hybrid models are able to obtain transparency at no or low cost of predictive performance. \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall \n
URL:https://stat-or.unc.edu/event/stor-colloquium-tong-wang-university-of-iowa/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20191014T153000
DTEND;TZID=America/New_York:20191014T163000
DTSTAMP:20210515T191025
CREATED:20190911T120737Z
LAST-MODIFIED:20191009T184019Z
UID:4386-1571067000-1571070600@stat-or.unc.edu
SUMMARY:STOR Colloquium: Heping Zhang\, Yale
DESCRIPTION:Back to the Basics: Residuals and Diagnostics for Generalized Linear Models\nHeping Zhang \nSusan Dwight Bliss Professor of Biostatistics \nYale University School of Public Health \nOrdinal outcomes are common in scientific research and everyday practice\, and we often rely on regression models to make inference. A long-standing problem with such regression analyses is the lack of effective diagnostic tools for validating model assumptions. The difficulty arises from the fact that an ordinal variable has discrete values that are labeled with\, but are not\, numerical values. The values merely represent ordered categories. In this paper\, we propose a surrogate approach to defining residuals for an ordinal outcome Y. The idea is to define a continuous variable S as a “surrogate” of Y and then obtain residuals based on S. For the general class of cumulative link regression models\, we study the residual’s theoretical and graphical properties. We show that the residual has null properties similar to those of the common residuals for continuous outcomes. Our numerical studies demonstrate that the residual has power to detect misspecification with respect to 1) mean structures; 2) link functions; 3) heteroscedasticity; 4) proportionality; and 5) mixed populations. The proposed residual also enables us to develop numeric measures for goodness-of-fit using classical distance notions. Our results suggest that compared to a previously defined residual\, our residual can reveal deeper insights into model diagnostics. We stress that this work focuses on residual analysis\, rather than hypothesis testing. The latter has limited utility as it only provides a single p-value\, whereas our residual can reveal what components of the model are misspecified and advise how to make improvements. \nThis is joint work with Dungang Liu\, University of Cincinnati Lindner College of Business.
URL:https://stat-or.unc.edu/event/stor-colloquium-heping-zhang-yale/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20191021T153000
DTEND;TZID=America/New_York:20191021T163000
DTSTAMP:20210515T191025
CREATED:20190926T175335Z
LAST-MODIFIED:20191015T195648Z
UID:4422-1571671800-1571675400@stat-or.unc.edu
SUMMARY:STOR Colloquium: Shujie Ma\, UC Riverside
DESCRIPTION:Shujie Ma \nUniversity of California\, Riverside \n \nHow many communities are there in a network? \n \nAdvances in modern technology have facilitated the collection of network data\, which emerge in many fields including biology\, bioinformatics\, physics\, economics\, sociology and so forth. Network data often have natural communities\, which are groups of interacting objects (i.e.\, nodes); pairs of nodes in the same group tend to interact more than pairs belonging to different groups. Community detection then becomes a very important task\, allowing us to identify and understand the structure of a network. Thus\, the development of methods for community detection has attracted much attention in the past decade\, and as a result\, different efficient approaches have been proposed in the literature. \n \nA fundamental limitation of most existing methods is that they divide networks into a fixed number of communities\, i.e.\, the number of communities is known and given in advance. However\, in practice\, such prior information is typically unavailable. Determining the number of communities is a challenging yet important task\, as the subsequent community detection procedure relies upon it. In this talk\, I will introduce a convenient and effective solution to this problem under the degree-corrected stochastic block models (DC-SBM). The proposed method takes advantage of spectral clustering\, the likelihood principle and binary segmentation. Determining the number of communities is essentially a model selection problem\, and we therefore establish the selection consistency of our proposed procedure under a mild condition on the average degree. We demonstrate the approach on different networks. At the end of my talk\, I will briefly talk about our other ongoing and future research projects in this line of work. \n \n \n \nRefreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall
URL:https://stat-or.unc.edu/event/stor-colloquium-shujie-ma-uc-riverside/
LOCATION:Hanes 120
CATEGORIES:STOR Colloquium
END:VEVENT
END:VCALENDAR