Author: Joseph
Translator: Huang Xiaowei, who has worked successively on research and applications involving game, social, and financial data, and is now with NetEase in Hangzhou. Further exchanges are welcome.
In January 2018, CRAN added 200 new R packages. This post gives a brief introduction to the functionality of 40 of them; the remaining packages can be looked up on CRAN. The picks fall into the following categories: data, data science, science, statistics, time series, visualization, and so on. CRAN is becoming a practical repository of hard-won scientific knowledge. (P.S. Some packages may be removed from CRAN after being listed, so take a little extra care when using them.)
1. Data
1. Provides an access interface to Canadian census and geographic data through an API.
2. Provides access to several elevation data sources and returns either a spatial points data frame or a raster object. Currently supported providers include the Elevation Service, the Terrain Service, Amazon Web Services Terrain Tiles, and the USGS Elevation Point Query Service.
3. Provides functions for hierarchical simulation, together with related data.
4. Supports quick import of WHO tuberculosis (TB) data and provides visualization functions for exploratory data analysis.
5. Provides a wrapper around the National Center for Biotechnology Information (NCBI) gene database that allows searching for homologous genes across species.
6. A data-only package containing spectral "transmittance" data for frequently used filters and materials, including plastics, films, optical glass, ordinary glass, and some laboratory dishes.
7. Provides an access interface to data sets and supports building complex input pipelines from simple, reusable building blocks.
8. Supports retrieving urban water supply and sanitation survey data provided by an organization working on clean water for the urban poor.
2. Data Science
1. Implements a clustering method based on the Chinese restaurant process (Pitman, 1995). The method does not require the number of clusters to be fixed in advance, and the package also provides functions for related calculations such as the entropy of fuzziness (1999).
2. Provides a high-level interface for neural networks.
3. Computes the area under the ROC curve using micro- and macro-averaging, providing a tool for multi-class classification problems.
4. Implements reinforcement learning environments and algorithms as described in Sutton & Barto (1998).
5. Provides a framework for unsupervised anomaly detection problems.
6. Provides functions that parse an R model object and return the result as a SQL query.
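
The micro- and macro-averaged AUC mentioned in item 3 above can be illustrated with a short, self-contained sketch. It is written in plain Python rather than R and is not the package's implementation; the function names (`binary_auc`, `macro_auc`, `micro_auc`) and the toy data are assumptions made for illustration only. Macro-averaging computes one one-vs-rest AUC per class and averages them, while micro-averaging pools every (score, is-this-the-true-class) pair into a single binary problem.

```python
from itertools import product


def binary_auc(scores, is_positive):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def macro_auc(probs, labels, classes):
    """Average of the one-vs-rest AUCs, one per class."""
    per_class = [
        binary_auc([row[k] for row in probs], [y == c for y in labels])
        for k, c in enumerate(classes)
    ]
    return sum(per_class) / len(per_class)


def micro_auc(probs, labels, classes):
    """Pool all (score, indicator) pairs across classes into one binary AUC."""
    scores, flags = [], []
    for row, y in zip(probs, labels):
        for k, c in enumerate(classes):
            scores.append(row[k])
            flags.append(y == c)
    return binary_auc(scores, flags)


# Hypothetical toy data: predicted class probabilities for four observations.
classes = [0, 1, 2]
labels = [0, 1, 2, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
]

print("macro AUC:", macro_auc(probs, labels, classes))
print("micro AUC:", micro_auc(probs, labels, classes))
```

On this toy data every class is perfectly separated one-vs-rest, so the macro average is 1.0, while the pooled micro average is lower because scores from different classes are compared against each other.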
3. Science
1. Provides database functions and resources that form an integrated framework for annotating genetic variation in genomic and transcriptomic data. Its wrapper functions unify the interfaces of many published annotation tools, such as VEP.
2. Provides a toolbox that makes R functions easier to use for students and professionals in epidemiology, public health, and related disciplines.
3. Provides a toolbox for working with two-dimensional animal trajectories.
4. Statistics
1. Provides efficient functions for automatically fitting GLMs.
2. Allows the creation of Dirichlet process objects that can be used as infinite mixture models. Applications include density estimation, inference on Poisson process intensity, hierarchical modeling, and clustering.
3. Provides functions for density estimation on large data sets and for generating unconditional or conditional random numbers using distribution element trees.
4. Provides the density, distribution, quantile, and random deviate functions for the generalized normal (exponential power) distribution.
5. Provides a general algorithm, the imputation regularized optimization (IRO) algorithm, for high-dimensional missing data problems.
6. Provides various functions for Kriging models and spatial statistics methods, including multivariate sensitivity analysis using reproducing kernel Hilbert spaces and the calculation of Sobol indices.
7. natural: Implements two methods for estimating the error variance in high-dimensional linear models.
8. Provides functions for operational risk modeling using maximum likelihood estimation and Bayesian methods.
9. Implements methods for network centrality analysis, primarily through partial rankings obtained by neighborhood inclusion or positional dominance.
10. Implements the PALM tree algorithm, a variant of the MOB algorithm in which some parameters are fixed across all groups.
11. Provides functions for computing many different types of pairwise multiple comparison tests.
12. Implements a domain-specific language for constructing PLS structural equation models, together with recent PLS methods such as consistent estimation (Dijkstra & Henseler, 2015) and adjustment for interactions (Henseler & Chin, 2010).
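
As an illustration of item 4 above, the density of the generalized normal (exponential power) distribution can be written down directly. The sketch below is plain Python, not the R package's code, and the parameter names `mu`, `alpha`, and `beta` are an assumed parameterization: location `mu`, scale `alpha`, and shape `beta`, with density `beta / (2 * alpha * Gamma(1 / beta)) * exp(-(|x - mu| / alpha) ** beta)`. Under this assumption, `beta = 2` recovers the normal distribution (with `alpha = sqrt(2) * sigma`) and `beta = 1` recovers the Laplace distribution.

```python
import math


def gnorm_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Density of the generalized normal (exponential power) distribution.

    mu is the location, alpha > 0 the scale, beta > 0 the shape
    (an assumed parameterization, chosen for this illustration).
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    norm_const = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm_const * math.exp(-((abs(x - mu) / alpha) ** beta))


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Ordinary normal density, used here only as a reference for comparison."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# With beta = 2 and alpha = sqrt(2) * sigma the two densities should agree.
sigma = 1.5
print(gnorm_pdf(0.3, mu=0.0, alpha=math.sqrt(2.0) * sigma, beta=2.0))
print(normal_pdf(0.3, mu=0.0, sigma=sigma))
```

Smaller values of `beta` give heavier tails than the normal distribution, and larger values approach a uniform shape, which is why the family is useful when the normal tail behavior is too restrictive.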
5. Time Series
1. Provides functions that automate a graphical pipeline for analyzing short time series, designed to accommodate asynchronous sampling times, inter-individual variability, noisy measured variables, and the like.
2. Provides methods for time series representation, covering principal component analysis, preprocessing, feature extraction, and similar tasks.
3. Provides a set of interactive visualization tools for time series analysis that support ts, mts, zoo, and xts objects, including functions for visualizing forecasting model performance, interactive time series charts, and seasonal plots.
6. Tools
1. arrangements: Provides fast generators and iterators for permutations, combinations, and partitions; the iterators let users produce results while saving memory.
2. fs: Implements a cross-platform interface to file system operations, built on a C library.
3. Provides functions for encoding simple features (sf) objects and coordinates with Google's polyline encoding algorithm.
4. For a given R package, queues its reverse dependencies for checking and supports multiple workers running the tests in parallel.
5. Implements a query generator based on Edgar F. Codd's relational algebra operators; its goal is to improve the experience of working with large data through "SQL".
6. Provides the tbl_ts class for storing and managing temporal data in a data-centric format.
7. Visualization
1. Displays a breakdown plot showing the contribution of each variable.
2. Provides functions for creating interactive graphics with Sigma.js.
Note: My ability is limited, so this translation may contain mistakes. Corrections and further exchanges are welcome.
Original article: https://..com/2018/02/22/jan-2018-top-40-picks-new-package/