PolyU Library

Online Tools for Assignment

This guide introduces useful online tools that may help you prepare your assignments.

Python


Python is a programming language that is widely used by data scientists for data processing and analysis. There are several benefits to using Python:

  • Open source
  • Mature packages or libraries
  • Easy to read, learn and use
  • Well supported with established user communities

One advantage of Python is that you can directly use a vast array of existing packages and libraries written by other users. You can find more Python libraries on the Python Package Index (PyPI). Here are some popular Python libraries:

Data Collection

  • Beautiful Soup
    Beautiful Soup is a library for pulling data out of HTML and XML files.
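
As a rough illustration of pulling data out of HTML, the sketch below parses a small page held in a string. The HTML snippet, the example.org URLs, and the variable names are made up for this example; it assumes the library is installed under its PyPI name `beautifulsoup4`.

```python
from bs4 import BeautifulSoup

# A small hypothetical HTML snippet standing in for a downloaded web page.
html = """
<html><body>
  <h1>Reading list</h1>
  <ul>
    <li><a href="https://example.org/a">First article</a></li>
    <li><a href="https://example.org/b">Second article</a></li>
  </ul>
</body></html>
"""

# "html.parser" is the parser bundled with Python, so no extra parser is needed.
soup = BeautifulSoup(html, "html.parser")

# Text of the first <h1> element.
title = soup.h1.get_text(strip=True)

# (link text, link target) for every <a> element on the page.
links = [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a")]

print(title)
for text, href in links:
    print(text, "->", href)
```

For a real page you would first download the HTML (for example with a separate HTTP library) and pass the downloaded text to `BeautifulSoup` in the same way.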

Data Cleaning/Manipulation

  • NumPy
    NumPy is a fundamental library for scientific computing.
  • pandas
    pandas is a library for manipulation and analysis of tabular data. Get the cheatsheet.
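
To give a feel for how these two libraries work together, here is a minimal sketch that cleans and summarises a tiny table. The data, column names, and the choice to fill the gap with the column mean are all assumptions made for illustration, not a recommended cleaning strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical survey data; NaN marks a missing answer.
df = pd.DataFrame({
    "student": ["A", "B", "C", "D"],
    "faculty": ["Engineering", "Business", "Engineering", "Business"],
    "hours": [10.0, np.nan, 14.0, 6.0],
})

# Cleaning: replace the missing value with the mean of the observed values.
df["hours"] = df["hours"].fillna(df["hours"].mean())

# Manipulation: average study hours per faculty.
mean_by_faculty = df.groupby("faculty")["hours"].mean()

# NumPy works directly on the underlying array, here to standardise the column.
hours = df["hours"].to_numpy()
standardized = (hours - hours.mean()) / hours.std()

print(mean_by_faculty)
print(standardized)
```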

Data Analysis

  • scikit-learn
    scikit-learn is a library for machine learning and predictive analysis.
  • NLTK
    The Natural Language Toolkit (NLTK) is a library for working with human language data and text analysis.
  • statsmodels
    statsmodels is a library for statistical analysis. Find more examples.
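
As one possible example of predictive analysis, the sketch below fits a straight line with scikit-learn. The data are synthetic (generated from y = 2x + 1 with no noise) purely so the result is easy to check; real data would need a train/test split and some evaluation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data following y = 2x + 1 exactly (an assumption for illustration).
X = np.arange(10, dtype=float).reshape(-1, 1)  # one feature, ten samples
y = 2.0 * X.ravel() + 1.0

# Fit an ordinary least-squares linear model.
model = LinearRegression()
model.fit(X, y)

slope = model.coef_[0]
intercept = model.intercept_

# Predict the target for a new, unseen value of x.
prediction = model.predict(np.array([[12.0]]))[0]

print(slope, intercept, prediction)
```

scikit-learn estimators generally follow this same `fit` then `predict` pattern, so swapping in another model usually changes only the line that constructs it.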

Data Visualization

  • matplotlib
    matplotlib is a plotting library which creates static and interactive visualizations.
  • seaborn
    seaborn is a statistical visualization library based on matplotlib.
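
A minimal plotting sketch with matplotlib is shown below. The monthly figures, labels, and the output file name `loans.png` are invented for this example. The `Agg` backend line is only there so the script also runs on a machine without a display; in Jupyter Notebook or Google Colab you would normally leave it out so the figure appears inline.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt

# Hypothetical data for illustration only.
months = ["Jan", "Feb", "Mar", "Apr"]
loans = [120, 95, 140, 110]

# Create a figure with a single set of axes and draw a bar chart on it.
fig, ax = plt.subplots(figsize=(5, 3))
ax.bar(months, loans)
ax.set_xlabel("Month")
ax.set_ylabel("Loans")
ax.set_title("Hypothetical monthly book loans")

# Tidy the layout and save the chart as an image file.
fig.tight_layout()
fig.savefig("loans.png")
```

Because seaborn is built on matplotlib, a seaborn plot can be labelled and saved through the same axes and figure objects.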

Python Code Editors

Many researchers run their Python code in computational notebooks such as Jupyter Notebook and Google Colab.

  • Jupyter Notebook

You can launch Jupyter Notebook via Anaconda, a Python distribution that includes Jupyter Notebook. Follow the steps to install Anaconda and refer to this beginner's guide to get started with Jupyter Notebook.

  • Google Colab

Google Colab is a cloud-based tool. Unlike Jupyter Notebook, it lets you write and execute Python code without installing anything. A Google account is required, and you can mount your Google Drive to read and write files. Watch this short video tutorial to get started with Google Colab.
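
Mounting Google Drive from a notebook cell might look like the sketch below. The helper name `mount_google_drive` is hypothetical, and `/content/drive` is assumed as the mount point; the `google.colab` module is only available inside Colab, so the function simply returns `False` when run anywhere else.

```python
def mount_google_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Google Colab.

    Returns True if the mount call was made, False when not running in Colab.
    """
    try:
        from google.colab import drive  # importable only inside Colab
    except ImportError:
        return False
    # In Colab this asks you to authorise access to your Google account.
    drive.mount(mount_point)
    return True


in_colab = mount_google_drive()
print("Drive mounted" if in_colab else "Not running in Colab; nothing mounted")
```

Once the drive is mounted, files in your Drive can be opened like ordinary files under the mount point.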


Learn Python

The Library has subscribed to DataCamp, an interactive learning platform that allows you to build data skills at your own pace. You can find topics like data cleaning, data visualization, machine learning, data engineering, statistics, and more. Most courses are beginner-friendly!

There are many interactive learning materials for Python and other programming languages. Just register through the Library's page to get started.


You can also learn how to use Python from the recorded workshops conducted by ITS.


Creative Commons License

Except where otherwise noted, the content of this guide is licensed under a CC BY-NC 4.0 License. Notify us if there are copyright concerns.