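Data science & machine learning in production
Machine learning models can enhance nearly every aspect of a business, from marketing to sales to maintenance. In manufacturing, the rise of the Internet of Things and the unprecedented volume of data it brings have created countless opportunities to apply machine learning. According to a report from Global Market Insights (《全球市场观察》), the global market for machine learning in manufacturing will soar from $1 billion in 2018 to $16 billion in 2025. Beyond that, there is a continuing need to reduce costs and to promote the adoption of Industry 4.0 technologies. Specifically, deep learning is already widely applied in areas such as predictive maintenance, quality control, logistics, and inventory management.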
We show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches
We have described a new model, BusTr, for predicting how long it will take public transit buses to travel between points on their routes based on contextual features such as location and time as well as estimates of current traffic conditions
We show in the offline evaluations the complete evolution of our embeddings, each annotated with version number and its retrieval metric, and the results of selected versions in end-to-end human relevance evaluations and engagement A/B experiments
We describe our journey in tackling the problem of diversity for Airbnb search, starting from heuristic based approaches and concluding with a novel deep learning solution that produces an embedding of the entire query context by leveraging Recurrent Neural Networks
Detailed experiments showed that one can collect high-quality data that improves both automatic offline metrics and user engagement metrics when used for training models
Considering the importance of measuring and mitigating algorithmic bias in large-scale ML based applications, we presented the LinkedIn Fairness Toolkit, a system for scalable and flexible computation of fairness metrics during different stages of the ML lifecycle
In this paper we presented the first results from deploying the Snorkel DryBell framework for weakly supervised machine learning in a large-scale, industrial setting
In this paper we described Smart Compose, a novel system that improves Gmail users’ writing experience by providing real-time, context-dependent and diverse suggestions as users type
Similarity of the listing to the past views of the user, computed based on co-view embeddings. These models tap into data that isn’t directly part of the search ranking training examples, providing the DNN with additional information
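As a rough illustration of the kind of feature described above, the sketch below scores a candidate listing against a user's past views. It assumes, purely for illustration, that the past-view embeddings are averaged and compared to the listing embedding with cosine similarity; the excerpt does not say how the similarity is actually computed, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector is all zeros, so the feature stays defined.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of a non-empty collection of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def similarity_to_past_views(
    listing_embedding: Sequence[float],
    past_view_embeddings: Sequence[Sequence[float]],
) -> float:
    """Hypothetical feature: similarity of a listing to the user's averaged past views.

    Averaging the past views is an assumption made for this sketch.
    """
    return cosine_similarity(listing_embedding, mean_vector(past_view_embeddings))
```

A scalar like this could then be passed to a ranking model as one extra input feature, which is one way to bring in signal from data outside the ranking training examples.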
In a standard Randomized Controlled Trial, the population is divided into control and treatment groups: all subjects in the treatment group are exposed to the change, and all subjects in the control group are exposed to no change
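The split described above can be sketched in a few lines. This is a minimal illustration, not the setup used in the cited work: the function names, the 50/50 default split, and the difference-in-means estimate are all assumptions chosen for the example.

```python
import random
from statistics import mean
from typing import Dict, Hashable, Iterable, Set, Tuple


def assign_groups(
    subject_ids: Iterable[Hashable],
    treatment_fraction: float = 0.5,
    seed: int = 0,
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Randomly split subjects into (treatment, control) groups.

    A seeded generator keeps the assignment reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    n_treatment = round(len(shuffled) * treatment_fraction)
    return set(shuffled[:n_treatment]), set(shuffled[n_treatment:])


def difference_in_means(
    outcomes: Dict[Hashable, float],
    treatment: Set[Hashable],
    control: Set[Hashable],
) -> float:
    """Mean outcome in the treatment group minus mean outcome in the control group."""
    return mean(outcomes[s] for s in treatment) - mean(outcomes[s] for s in control)
```

Because assignment is random, the two groups differ on average only in their exposure to the change, so a difference in mean outcomes can be read as an estimate of the change's effect. A real analysis would also report uncertainty, for example with a confidence interval.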