Top R libraries Data Scientists must know in 2022

Moez Ali
4 min read · Jul 4, 2022

Introduction

Python and R are the two most popular programming languages for data science and machine learning in 2022. Each has its own advantages.

Since Python is a general-purpose programming language, it has a great ecosystem for software development, web development, automation, MLOps, etc. R's capabilities for statistical analysis and complex data analysis are richer, although Python is catching up quickly there.

In this article, I will discuss some of the most useful R libraries data scientists should know in 2022. The list is subjective and is in no particular order.

dplyr

dplyr is a gold standard for data manipulation in R. It offers a consistent set of verbs that help you solve the most common data manipulation problems:

  • mutate() adds new variables that are functions of existing variables.
  • select() picks variables based on their names.
  • filter() picks cases based on their values.
  • summarise() reduces multiple values down to a single summary.
  • arrange() changes the ordering of the rows.
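
To make the five verbs concrete, here is a minimal sketch that chains them on the built-in mtcars dataset. The dataset, the derived column name kpl, and the conversion factor are my own illustrative choices, not something dplyr prescribes.

```r
library(dplyr)

# mtcars ships with base R; it is used here purely as example data.
efficient <- mtcars %>%
  mutate(kpl = mpg * 0.4251) %>%   # add a column: approx. km per litre (illustrative)
  select(mpg, kpl, cyl, hp) %>%    # keep only the columns we care about
  filter(cyl == 4) %>%             # keep only four-cylinder cars
  arrange(desc(kpl))               # most fuel-efficient cars first

# summarise() collapses the remaining rows into a single summary row.
overall <- efficient %>%
  summarise(n_cars = n(), mean_kpl = mean(kpl))

print(head(efficient, 3))
print(overall)
```

Each verb takes a data frame and returns a data frame, which is why the steps can be piped together with %>% and read top to bottom.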

Written by Moez Ali

Data Scientist, Founder & Creator of PyCaret
