A10 Configuring Linux for scientific work
Description |
Linux is great for scientific work. This goes over some key things to install to get going. |
Video intro |
|
Reading |
>Software Carpentry set up material |
Questions |
>What kind of tools do I need? |
Aalto |
About Linux
Linux is an operating system known for its flexibility and power. It doesn’t hide things from the user, which makes it especially suitable for scientific computing, where you need to assemble your own pieces together and have full control. Because of it’s open-source spirit, many other open-source tools are developed for it
Linux is not just one thing: there are many distributions which
combine software. Which one to choose is basically user preference
(ask your friends what they use), but there are two major types:
Debian-based (uses apt
to install programs) and Red-hat
based (uses yum
to install programs). In practice, Ubuntu is
a good default these days. These instructions (so far) are for
Debian-based distributions like Ubuntu.
On Ubuntu, the standard way to install things is sudo apt-get
install $package_names ...
Shell tools
The shell provides an interface to efficiently access the true power of a computer. Now we use it to install tools but it can be used for many other tasks too.
Every Linux distribution comes with a shell already installed. Start the “Terminal” or “Shell” to see it. To verify, try running this:
$ echo $SHELL
/bin/bash
The convention is the $
represents lines you type (without the
$
- notice most shell prompts have it there already), the other
lines are what comes out. #
represents comments.
If you want a crash course on using the shell, see the Aalto shell crash course. You don’t need this right now.
Version control (git)
Using version control is like an insurance for your projects. It is not only about tracking changes but also to improve your project visibility and make it easier to collaborate.
Git is the most popular system for version control and GitHub is one of the services that provide online storing for projects.
This comes included in all operating systems, but needs to be installed. Here, we install git and some other useful frontends for it:
$ sudo apt install git gitk gitg
Verify from the shell (see above to start the shell):
$ git --version
git version 2.20.1
Your organization might provide you access to some other repository manager than GitHub but since GitHub is a higher availablity solution, it does not hurt to create an account there. You can sign up for Github here
Anaconda (Python)
In software development there are some standard packages that are useful to have without the trouble of installing them separately with their dependencies.
There are very many programming languages, and you probably won’t only use Python. But, it is quite common so we mention it here. We install the Anaconda distribution of Python: it gets you all the basic things you need, and can also install R and other programming languages, too. Anaconda is large and has all the most common tools people need - if you want to save space, install Miniconda instead (then you have to decide what extra packages you want).
This will get you Jupyter and many other Python things, too.
Anaconda allows you to manage your development environment which is good since you can have different environments dedicated to their designated purposes.
Todo
How to install it in the shell. How to start/use it. Easier install instructions. Link the SWC video.
To verify from the shell (see above to start the shell):
$ python3 -V
Python 3.6.8 :: Anaconda custom (64-bit)
$ conda info
active environment : None
...
base environment : /home/rkdarst/anaconda3 (writable)
Editor
It’s good to have one command-line editor and one graphical Integrated Development Environment.
Command line editor
For fast things, you want to be able to edit files quickly from a
the command line. Nano
is the simplest to use. If you want, you can check out vim or emacs,
but they certainly harder to use so we don’t recommend them to start
off.
To install nano
:
$ sudo apt-get install nano
Todo
Is this the most useful verification?
See this nano tutorial to learn more. To verify nano from the shell (see above to start the shell):
$ nano my_file.txt
Integrated Development Environment
** You should install one good Integrated Development Environment (IDE). This has coding, version control, and many more things build in to one interface. These days, VSCode is the most popular. Install from the vscode website. Out of principle, we recommend you disable data collection.
Emacs can also serve as an IDE once you learn enough about it.
Jupyter
Jupyter is an interactive way to explore data and do programming. It can be used to add code, output, titles, text and visualisations into one document. It’s already installed along with Anaconda. To start it in a certain directory, go to that directory in the shell and run:
$ jupyter notebook # older notebook interface
$ jupyter lab # newer JupyterLab interface
Follow this to install useful extensions to your environment. Especially ipywidgets are needed if you continue to do exercises.
Other programming tools
Install:
$ sudo apt install build-essential meld
build-essential
installs some basic compilers and so on.meld
: A graphical diff program
If you wish to obtain credits from the course, you might need
NumPy
Matplotlib
to complete exercises. These libraries are pre-installed with Anaconda installation. Further information about installations can be found here: NumPy and Matplotlib