# Code Webis FAQ

# Quick Links
- [How to get started in webis?](#how-to-get-started-in-webis)
  - [How to get started with setting up my workstation?](#how-to-get-started-with-setting-up-my-workstation)
  - [How to get started with a project?](#how-to-get-started-with-a-project)
  - [How to get started with Git?](#how-to-get-started-with-git)
  - [How to get started with Docker?](#how-to-get-started-with-docker)
  - [How to get started with working on a cluster?](#how-to-get-started-with-working-on-a-cluster)
- [How can I get access to various resources?](#how-can-i-get-access-to-various-resources)
- [How can I create my user account on a machine?](#how-can-i-create-my-user-account-on-a-machine)
- [How to organize my data on the webis machines?](#how-to-organize-my-data-on-the-webis-machines)
- [How many servers are available on the webis network?](#how-many-servers-are-available-on-the-webis-network)
- [How many clusters are available on the webis network?](#how-many-clusters-are-available-on-the-webis-network)
- [What are the naming conventions for projects and additional material?](#what-are-the-naming-conventions-for-projects-and-additional-material)
- [What are the different corpora available?](#what-are-the-different-corpora-available)
- [Where are the webis labs located?](#where-are-the-webis-labs-located)
- [Where to ask for help?](#where-to-ask-for-help)
- [Who are the admins?](#who-are-the-admins)

# How to get started in webis?
Welcome to the webis group! For a smooth working experience in webis, there are a few things you need to know beforehand. Kindly refer to the following sections for more information:
1. [Access to the lab](#how-can-i-get-access-to-various-resources)
2. [Account creation](#how-can-i-create-my-user-account-on-a-machine)
3. [Setting up a workstation](#how-to-get-started-with-setting-up-my-workstation)
4. [Organizing data on the webis machines](#how-to-organize-my-data-on-the-webis-machines)
5. [Git & Gitlab](#how-to-get-started-with-git)
6. [Creating a project](#how-to-get-started-with-a-project)
7. [Docker](#how-to-get-started-with-docker)
8. [Working on Clusters](#how-to-get-started-with-working-on-a-cluster)

# How to get started with setting up my workstation?
Every student gets a lab workstation for their use.

- Ask student assistants for account creation
  - Your account name will be your SCC account name
- After successfully logging in, you need to install additional software/tools on your machine
- Complete instructions for setting up your new workstation can be found [here](https://git.webis.de/code-generic/code-admin-knowledgebase/tree/master/workstations#installing-a-new-computer)
- Remote usage (see the yellow sticker on your machine for `<number>`, e.g., webislab99):
    `ssh <user>@webis<number>.medien.uni-weimar.de`
- Check the current memory/CPU load with the `top` command before starting your own process
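
As a rough illustration of the two steps above, the following Python sketch builds the `ssh` call and checks the CPU load before starting a job. The helper names, the example account `abcd1234`, and the 0.8 load threshold are assumptions made for this example, not webis conventions. `os.getloadavg()` is only available on Unix-like systems, and the sketch does not look at memory.

```python
import os
from typing import List

DOMAIN = "medien.uni-weimar.de"  # domain from the ssh example above


def ssh_command(user: str, number: int) -> List[str]:
    """Build the ssh call for the workstation webis<number>."""
    return ["ssh", f"{user}@webis{number}.{DOMAIN}"]


def is_busy(load_1min: float, cpu_count: int, threshold: float = 0.8) -> bool:
    """Return True if the 1-minute load per CPU reaches the threshold.

    The default threshold of 0.8 is a hypothetical value for this sketch.
    """
    return load_1min / cpu_count >= threshold


# "abcd1234" is a made-up SCC account name; 99 is the number from the example.
print(" ".join(ssh_command("abcd1234", 99)))

# Check the load of the machine this script runs on (Unix-like systems only).
if hasattr(os, "getloadavg"):
    try:
        load_1min = os.getloadavg()[0]
        cpus = os.cpu_count() or 1
        print("busy" if is_busy(load_1min, cpus) else "free")
    except OSError:
        print("load average not available")
```

Run on the workstation itself, this gives a quick yes/no answer. `top` remains the better tool for seeing which processes cause the load and how much memory they use.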

**Note**
- **Do not** shut down the machines
- **Do not** use the machines for sharing files
- You are responsible for keeping the machines running and healthy
- Pay attention to the mails from the monitoring system
- If you see an error, fix it if possible and report it on the mailing list

# How to get started with a project?
Git is the default version control system for all webis projects, so a basic understanding of it is needed before starting to work on a project. Projects can be accessed and managed through Gitlab. For more information on Git, see [here](#how-to-get-started-with-git).
- For details on organizing your data on the webis network, see [here](#how-to-organize-my-data-on-the-webis-machines)
- For naming conventions of projects/theses, see [here](#what-are-the-naming-conventions-for-projects-and-additional-material)

# How to get started with Git?
- Main [URL](https://git.webis.de) 
- [Getting started and Access](https://git.webis.de/code-generic/code-webis-cmd/wikis/git-setup)
- [Git basics](https://git.webis.de/help/gitlab-basics/README.md)
- **Structure**
  - Group tools: `aitools`
  - Libraries: `thirdparty`
  - Student projects/theses: `webisstud`
  - Staff projects : own Gitlab group
- **Staff project structure**
```
    <project>/
      data/       Input/result data (put intermediate data into .gitignore)
      doc/        Documentation files, including presentations
      material/   Papers, books, links, ... 
                  Filenames: <last-name-first-author><two-digits-year>-<title>.[pdf, ...]
      resources/  Program resources (word lists, rule lists, ...)
      src/        Source files (Java)
      src-<lang>/ Source files (For example src-tex for the thesis)
      test/       Unit test source files (Java)
```
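
The structure above could be created with a small helper like the following. This is only a sketch: the function name `create_project_skeleton` and the project name `my-project` are hypothetical, and the demo writes into a temporary directory so nothing is left behind.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

# Directory names and purposes as listed in the structure above.
STAFF_PROJECT_DIRS = {
    "data": "Input/result data",
    "doc": "Documentation files, including presentations",
    "material": "Papers, books, links",
    "resources": "Program resources (word lists, rule lists)",
    "src": "Source files (Java)",
    "test": "Unit test source files (Java)",
}


def create_project_skeleton(root: Path, extra_langs: Iterable[str] = ()) -> List[Path]:
    """Create the standard directories below `root`.

    Each entry in `extra_langs` adds a src-<lang> directory, e.g. "tex"
    adds src-tex for the thesis sources.
    """
    names = list(STAFF_PROJECT_DIRS) + [f"src-{lang}" for lang in extra_langs]
    created = []
    for name in names:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demo in a temporary directory; "my-project" is a made-up project name.
with tempfile.TemporaryDirectory() as tmp:
    for directory in create_project_skeleton(Path(tmp) / "my-project", ["tex"]):
        print(directory.name)
```
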
- **For all filenames**:
  - Use only lowercase letters, numbers, and the hyphen (-)
  - That means: no spaces, no special characters, no uppercase letters
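
The filename rule can be checked mechanically. The sketch below is one possible reading of it: it additionally allows dot-separated extensions such as `.pdf`, which is an assumption, since the rule itself does not mention dots. The function name and the sample filenames are made up for illustration.

```python
import re

# Lowercase letters, digits, and hyphens only. The optional extension part
# (".pdf", ".tar.gz") is an assumption; the rule above does not mention dots.
FILENAME = re.compile(r"[a-z0-9-]+(?:\.[a-z0-9]+)*")


def is_valid_filename(name: str) -> bool:
    """Return True if `name` follows the filename rule as read in this sketch."""
    return FILENAME.fullmatch(name) is not None


# Made-up sample names: one valid, one with a space and uppercase letters,
# one with an underscore.
for candidate in ["doe17-example-title.pdf", "My Notes.txt", "word_list.txt"]:
    print(candidate, is_valid_filename(candidate))
```

Note that this sketch rejects dotfiles such as `.gitignore`, so it is meant for regular project files only.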

# How to get started with Docker?
- Docker allows you to run containerized applications isolated from your host system
- You can run multiple containers on a single machine running different/connected services
- [Installation instructions](https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-16-04)
- For more explanation, kindly use the following resources:
  - [Official documentation](https://docs.docker.com/)
  - [Cheat sheet](https://github.com/eon01/DockerCheatSheet)
- At any given time, multiple containers used by different people can be running on a machine (especially Gammaweb)
- **Please ensure that you have identified the right containers before you stop/delete them**
- Check the list of containers running on a machine using `docker ps -a`
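
To make the check before stopping or deleting a container less error-prone, the listing can be turned into structured records. The following is a minimal sketch, assuming the `docker` CLI is on the `PATH`; it uses the `--format` option of `docker ps` with the `.ID`, `.Names`, `.Image`, and `.Status` placeholders. The function names and the sample container names are hypothetical, and treating a status that starts with "Up" as "running" is a simplification.

```python
import subprocess
from typing import Dict, List

# Tab-separated output: one container per line.
FORMAT = "{{.ID}}\t{{.Names}}\t{{.Image}}\t{{.Status}}"


def parse_docker_ps(output: str) -> List[Dict[str, str]]:
    """Parse tab-separated `docker ps` output into one dict per container."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip empty lines
        container_id, name, image, status = line.split("\t", 3)
        rows.append({"id": container_id, "name": name, "image": image, "status": status})
    return rows


def list_containers() -> List[Dict[str, str]]:
    """Run `docker ps -a` on this machine and parse its output."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", FORMAT],
        capture_output=True, text=True, check=True,
    )
    return parse_docker_ps(result.stdout)


def is_running(row: Dict[str, str]) -> bool:
    """Simplified check: Docker reports running containers as 'Up ...'."""
    return row["status"].startswith("Up")


# Made-up sample output, so the demo works without a Docker daemon.
SAMPLE = (
    "3f2a1b4c5d6e\tuser-a-spark\tspark:latest\tUp 2 hours\n"
    "9a8b7c6d5e4f\tuser-b-notebook\tjupyter:latest\tExited (0) 3 days ago\n"
)
for row in parse_docker_ps(SAMPLE):
    print(row["name"], "running" if is_running(row) else "not running")
```

Docker does not record which person started a container, so a recognizable container name (for example one that includes your user name) is what makes this listing useful on shared machines.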

# How to get started with working on a cluster?
- Clone the aitools4-aq-cluster-computing [repository](https://git.webis.de/aitools/aitools4-aq-cluster-computing)
- `cd docker` and follow the instructions from the [README](https://git.webis.de/aitools/aitools4-aq-cluster-computing/tree/master/docker/Readme.md) file of the repository
- Please note that this container is meant only for Spark/Hadoop jobs and small-scale CPU models for TensorFlow/Theano
- To run GPU-based neural networks, you need to use specific containers for your frameworks

# How can I get access to various resources?
You need access to:

- a lab where you can find a workstation for yourself
  - ask the secretary to grant you access
- the webis account password for creating your own account
  - ask your supervisor. Kindly note that the password must not be shared with anyone
- the network drive (`/mnt/nfs/webis20`) to store large amounts of data
  - you should already have access to it after creating an account. If you face any problems, kindly ask one of the [administrators](#who-are-the-admins)

# How can I create my user account on a machine?
Log in to your machine with the webis account for the first time and add a new user account for yourself. Then log out and log in again with your new credentials.

# How to organize my data on the webis machines?
![Organizing your projects](https://d2mxuefqeaa7sj.cloudfront.net/s_9622731C8FA33BE0C917B1BFD0F8F1DD79A33AD5A95A1D2F359CC63A0FD80D8D_1512642836682_image.png)

# How many servers are available on the webis network?

For larger experiments:

- Servers webis17 to webis19, webis60, and webis62 are (usually) available
- Ask staff first and respect other users' processes
- **Never run** experiments as the user **webis**
- Create your own account (SCC login)

| **Machine** | **CPUs** | **RAM** | **Storage** | **Notes**                    |
| ----------- | -------- | ------- | ----------- | ---------------------------- |
| webis16     | 16       | 72 GB   | 2 x 9 TB    | ***Reserved: Web services*** |
| webis17     | 16       | 72 GB   | 1 x 4.5 TB  |                              |
| webis18     | 16       | 72 GB   | 1 x 4.5 TB  |                              |
| webis19     | 16       | 72 GB   | 1 x 4.5 TB  |                              |
| webis20     | 20       | 100 GB  | 60 x 4.0 TB | ***Reserved: Storage***      |
| webis60     | 16       | 72 GB   | 1 x 600 GB  |                              |
| webis61     | 16       | 72 GB   | 6 x 136 GB  | ***Reserved: Netspeak***     |
| webis62     | 24       | 128 GB  | 8 x 2.7 TB  | ***4 x GTX480 GPUs***        |


# How many clusters are available on the webis network?
- We currently have 3 clusters: **Alphaweb, Betaweb, Gammaweb**
- Documentation for setup and monitoring tools can be found on our [Facilities page](https://www.uni-weimar.de/en/media/chairs/computer-science-and-media/webis/facilities/)
- **Alphaweb:** Old workstations wired together
- **Betaweb:** Server racks located in the DBL basement
- **Gammaweb:** Three very powerful machines with 8 GPUs
- **Deltaweb:** (soon)

# What are the naming conventions for projects and additional material?
- A project is named `wstud-<project-name>-[ws,ss]<start-year>`
- A thesis is named `wstud-thesis-<last-name>`
- `<user>` is your SCC user account (university login, 4 letters followed by 4 digits)
- Reference papers and books are named `<last-name-first-author><two-digits-year>-<title>.[pdf, ...]`
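
The patterns above translate fairly directly into regular expressions. The sketch below is one possible reading, with several assumptions: names are lowercase with hyphen-separated parts, `<start-year>` has four digits, and last names contain only the letters a-z. The function name `kind_of` and all sample names are made up for illustration.

```python
import re

# wstud-<project-name>-[ws,ss]<start-year>; a four-digit year is assumed here.
PROJECT = re.compile(r"wstud-[a-z0-9]+(?:-[a-z0-9]+)*-(?:ws|ss)\d{4}")
# wstud-thesis-<last-name>
THESIS = re.compile(r"wstud-thesis-[a-z]+(?:-[a-z]+)*")
# SCC account: 4 letters followed by 4 digits (lowercase is assumed here).
SCC_USER = re.compile(r"[a-z]{4}\d{4}")
# <last-name-first-author><two-digits-year>-<title>.<extension>
REFERENCE = re.compile(r"[a-z]+\d{2}-[a-z0-9]+(?:-[a-z0-9]+)*\.[a-z0-9]+")

CHECKS = [
    ("thesis", THESIS),
    ("project", PROJECT),
    ("user", SCC_USER),
    ("reference", REFERENCE),
]


def kind_of(name: str) -> str:
    """Return which convention `name` matches, or 'unknown' if none does."""
    for kind, pattern in CHECKS:
        if pattern.fullmatch(name):
            return kind
    return "unknown"


# Made-up sample names for each convention, plus one that matches none.
for sample in [
    "wstud-example-project-ws2017",
    "wstud-thesis-doe",
    "abcd1234",
    "doe17-example-title.pdf",
    "Wstud-Example",
]:
    print(sample, "->", kind_of(sample))
```
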

# What are the different corpora available?

Our [Data page](https://webis.de/data.html) lists the corpora available on `betaweb` and in `/mnt/nfs/webis20/corpora`, which is mounted on your workstation.

- When you find/create an interesting corpus, tell us and we will add it


# Where are the webis labs located?

- See [Lab Space](https://webis.de/facilities.html#lab-space).

- Keep the labs clean
  - Remove bottles/snack packages
  - Write your notes, but don’t leave a mess of papers lying around
  - Wipe your desk
  - Leave your desk as you would like to find it
- When leaving the lab
  - **Do not** shut down the machines
  - **Close** all the windows
  - **Switch off** the lights


# Where to ask for help?
- Have a look at the [Documentation page](https://www.uni-weimar.de/en/media/chairs/computer-science-and-media/webis/facilities/#webis-documentation)
- Ask [Student assistants](https://www.uni-weimar.de/en/media/chairs/computer-science-and-media/webis/people/#webis-student-assistants) for help
- Ask on the mailing list: `webisstud@listserv.uni-weimar.de`
  - There is no such thing as a stupid question, so feel free to ask anything
  - Ask for and also provide help if possible
  - Discuss things related to the webis group
- Discuss on the [Webis Discord](https://discordapp.com/invite/AucNCVX)
- Ask [Staff members](https://www.uni-weimar.de/en/media/chairs/computer-science-and-media/webis/people/#webis-assistants)

# Who are the admins?
- Students: kai.lorenz@uni-weimar.de
- Staff: michael.voelske@uni-weimar.de, johannes.kiesel@uni-weimar.de