---
title: "Intro To Citrix And Linux"
author: "Sean, Bronson, Debbie and Marc"
date: "2/10/2016"
output: pdf_document
---
Welcome to the introductory course for computer resources here at SCRI. This course will cover a basic introduction to the SCRI Citrix environment and also will introduce you to basic tools in Linux. If you have a bad case of CLAS (command line aversion syndrome), then this course should help you get on the road to recovery.
# Introduction to Citrix:
### What is Citrix?
Citrix is a platform for accessing programs from many different machines. A user can log in to any machine on the network and continue working on their projects without the program having to be installed on every machine. This is very useful for nurses and students who move between locations a lot. But why is this important?
### Why Citrix?
Many people here at SCRI avoid Citrix like the plague because of the common perception that Citrix is "bad". When needed, however, Citrix provides a steady platform of available programs to help complete most work. This includes access to email, Microsoft Office, and share drives from home. In fact, you can log into your local machine via the Citrix applications to use the programs that are specific to your computer!
### Using Citrix safely?
Most wouldn't guess that Citrix is actually being used by almost everyone. Almost all of the icons on your desktop, besides a select few, are populated when Citrix loads. How you access Citrix and how you save what you are working on make a lot of difference. Knowing how to use Citrix can be the difference between saving an important document and having to rebuild it from scratch. So we will go over some of the safest ways you can use Citrix to your advantage!
### Getting Started
To start off please ensure you have completed the following:
1. Logged on to the PDN: *Hybridapproach*
2. Navigated to https://desktops.seattlechildrens.org
3. Logged in with Gemalto using your pin or dongle
4. Copied the Mobaxterm portable app to your "O:\ Drive"
Terms you might hear in this lesson are: VDI, VDS, ACL, XenApp, XenDesktop, Virtual Machine, and Virtual Computer. They are defined here so you understand the differences.
* VDI = Virtual Desktop Infrastructure. Here it means a hardware device that runs a virtual machine.
* VDS = Virtual Desktop Session. A session accessing software on a server.
* ACL = Access Control List. Enables or denies communication between computers and servers.
* XenApp = The server that pushes apps to your account.
* XenDesktop = The session loaded on VDIs and when you log into https://desktops.seattlechildrens.org/
* Virtual Machine = Virtual Computer. The two terms are equal in meaning: a virtual computer running on a server in a different location.
### Diagram 1

### Diagram 2:

### Key Takeaways
__Virtual Desktop Session__
* Files stored here persist, but are not backed up nor easily accessible
* O:\ drive is accessible
* Otherwise, no network access, therefore no internet and no access to Department shares
* No applications
__Citrix Environment__
* Applications are launched from a server, but use "local" resources
* Citrix applications save files by default to My Citrix Computer, but...
* My Citrix Computer (including the Desktop) is an application. Therefore, anything stored there is lost when the application closes.
* Citrix can access Department shares
* Citrix can access O:\ drive
* You cannot drag and drop files from local applications (i.e. windows opened from VDS or Children's Desktop) into Citrix applications.
* Use Remote Desktop to access your Children's desktop computer
__Children's Desktop Computer__
* Running locally installed applications will always be faster than Citrix applications
* You can request admin rights for your computer to allow you some control on installation of some applications
* Your hard drive is not backed up
* O:\ drive is accessible
* Departmental shares are accessible
#### Using the "O:\ Drive" or your group share as a single point of access is the safest way to save and access data because it is not only accessible everywhere you go but it is also backed up. This cannot be stressed enough!
***
#### <span style="color:blue">__Exercise C1:__</span>
1. Open "My Citrix Computer" From the desktop and pin it to the right of the screen. We will refer to this as "Citrix"
2. Create a file called textfile1.txt on your "O:\ Drive"
3. Open the VDS "MyComputer" by Clicking Start > Computer and pin that Window to the left of your screen.
4. Navigate to the text document on your "O:\ Drive" in the left Window on your VDS.
5. Try and drag it from your VDS to your Citrix.
a. (Notice how there is a Denial symbol?) This is because you cannot transfer between your VDS and Citrix. See Diagram 2.
b. Leave both Windows open for Exercise 2.
***
#### <span style="color:blue">__Exercise C2:__</span>
1. Minimize the VDS window.
2. Open "Microsoft Outlook 2010" from your desktop. Pin it to the left side of your screen.
3. Locate the "Dummy Email" sent to you this morning. If you did not receive it let me know.
4. Drag the attachment in the "Dummy Email" from Outlook to your "O:\ Drive" on "Citrix"
a. (Notice how it allowed you to transfer the attachment there?)
5. Open a new email and drag the text document from your "O:\ Drive" on "Citrix" to the new email.
a. (Notice how it attached properly?)
Citrix can talk to Citrix, and VDS can talk to VDS. If you have ever struggled to attach that Word document or important grant to your email, it is likely that you were trying to use a mixed environment scenario.
***
#### <span style="color:blue">__Exercise C3:__</span>
1. Open both VDS and Citrix windows the same way as in Exercise C1. VDS on the left, Citrix on the right.
2. Navigate to the "Desktop" on both.
a. (Notice that the number of icons is different on both?)
3. Navigate to the "O:\ Drive" on your Citrix Desktop and copy the "textfile1.txt" that you created earlier.
4. Now navigate back to the "Desktop" in the Citrix window as before and paste the "textfile1.txt"
5. Close out of the Citrix window.
6. Open "My Citrix Computer" Once again and navigate to the "Desktop"
a. (Notice how there is no text document on the desktop?)
Saving to the Citrix Desktop (the one with fewer icons), and closing out will cause you to lose what was saved there. This is because the Citrix window desktop is not actually an existing folder. Saving to the "O:\ Drive" or a department share is the single safest way to protect your data.
***
#### <span style="color:blue">__10 min break__</span>
***
# Computational Resources at SCRI
The Research Informatics Team is engaged in *transforming* the way SCRI interacts with and handles big data. Part of this initiative is providing access to more scientific computing resources. The first systems to come online are a pair of high performance Linux workstations. Keeping with the theme of "transforming research", these machines have been unofficially dubbed the Autobots, and have been given the aliases Sunstreaker and Sideswiper.
Computer Name | Alias | Memory | Processor | Cores|Primary Function
--------------|-------|--------|-----------|------|----------------
EWRLNXRD28|Sunstreaker|128 GB| Dual Intel Xeon 2.3 GHz| 40 logical| Hosting home directory
EWRLNXRD29|Sideswiper|128 GB| Dual Intel Xeon 2.3 GHz| 40 logical | Hosting RStudioServer
__Coming soon__
Computer Name | Alias | Memory | Processor | Cores|Primary Function
--------------|-------|--------|-----------|------|----------------
--|Ratchet|256 GB| Dual Intel Xeon 2.5 GHz| 48 logical| Dedicated RStudioServer
--|Optimus|256 GB| Dual Intel Xeon 2.5 GHz| 48 logical | Head node
# Introduction to Linux:
### What is Linux?
Linux is a free, open-source operating system based on Unix. It's technically not Unix, but in practice it might as well be. Unix is like a shark of the computer world in the sense that the basic design has been around forever and is still good for what it does. It has not changed much because people still find it very useful and powerful to this day.
### Why do people use it?
People like Unix because it is a powerful and flexible tool for efficiently getting certain kinds of work done. This is especially true for work that has never been done before or that requires the creation of customized work flows. If this sounds familiar, that's because this is the same kind of work that people often need to do to manage data in science. And this is why so many scientists have found Linux to be a useful platform for developing new work flows. Everyone agrees that graphical interfaces can be nice. But the odds that you will find one that already does exactly what your experiment needs get vanishingly small starting with the moment you do anything interesting.
A question that has come up, and that perhaps you have asked, is "Is this really worth it? Why should I invest my time in learning Linux?" The answer to this question could be a course all by itself. Let us just try to point out what we think are a few of the advantages:
* Many of the 'industry standard' tools used to analyze today's data are only available in Linux. Tools such as GATK, tophat, bwa, cufflinks and samtools are open source and meant to be distributed for free, and they aren't readily available outside of the command-line environment.
* Fine control. Whenever you make a graphical user interface, it becomes very hard to expose all the options that the program is capable of in an easy way. Therefore, you usually make assumptions and lock in some defaults. At the command-line, you can more easily allow the user to specify parameters and gain fine control of the program.
* Modularity. With a command-line interface, you can string small programs together like Legos and build your own custom workflows and programs that work exactly the way you want them to.
* Reproducibility. When you use the command line and/or write scripts, you create a record of exactly what you did to your data. Anytime you re-run the commands you wrote, you will get the same result. In contrast, if you use a GUI, you would have to meticulously record all the different things that you had tried just to have a chance of reproducing the same result again at some point in the future. If you forget something about how you used a GUI it can become a problem later. Suppose you forget that you used a special extra option from one of the custom drop-down menus? Now suddenly you may no longer be able to make the exact same figure that you published in that paper...
* Speed. In the short-term, it may seem like a huge investment of time and energy to learn how to work in this environment and use these tools. The payoff comes though when you can write scripts to automate many tasks that once required hours of tedious repetition. Also, with Linux, you can take better advantage of parallel processing to vastly speed up the processing time.
* Scale. Most of the files that you may encounter, such as fasta, fastq, bam, etc, are enormous. Although they are often just simple text documents, most desktop computers simply can't handle them, so there is no way to open them or even 'take a peek' inside them. Dealing with large files and scaling up your analyses are trivial for Linux and you don't need any fancy software to do it.
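To make the modularity and scale points a little more concrete, here is a minimal sketch using a few standard tools. It is only an illustration, not a real analysis: the file name `reads.txt` and its contents are made up for the example, and everything happens in a temporary directory so nothing in your account is touched.

```bash
# Work in a throwaway directory (hypothetical example data, not a real dataset).
workdir=$(mktemp -d)
cd "$workdir"

# Build a 100,000-line text file; every 4th line is a header starting with '@'.
seq 1 100000 | awk '{ if (NR % 4 == 1) print "@read" NR; else print "ACGT" }' > reads.txt

# Peek at the first 4 lines without opening the whole file.
head -n 4 reads.txt

# Chain small programs together: keep the header lines, count them,
# and strip any padding spaces from the count.
header_count=$(grep '^@' reads.txt | wc -l | tr -d ' ')
echo "header lines: $header_count"
```

Each program in the chain does one small job, and the `|` character hands the output of one program to the next. Swapping any single piece changes the workflow without rewriting the rest.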
# An Introduction to the command line:
> "An elegant weapon, for a more civilized age."
The command line is the main way that professional people interact with Unix. You might be tempted to think that something so old cannot possibly be as good as a more modern graphical interface. But the command line demonstrates how a simpler tool can often be superior to much more complicated solutions.
### Let's start by logging you in:
To connect to a Linux machine, one of the most common tools is `ssh` or 'secure shell'. You can use `ssh` to connect to a Linux machine if you have permission to do so like this:
```{bash, eval=FALSE}
ssh user@hostname
```
Normally ssh is not the first command you would teach a new student of the Linux command line, but in this case we want you to use this information to log into the sunstreaker machine so that we can get started.
### The prompt
When you first log in to a Linux terminal, you will see (with various levels of consternation) the __command prompt__. Like the dashboard of a car, the command prompt may look slightly different depending on the system, but in general the prompt will look something like this:
```{bash, eval=FALSE}
userid@hostname:/workingdirectory $
```
The __userid__ is the user name that is logged in to the system, in this case your SCH user id. The __hostname__ is the name of the computer you are logged into. Note that it is the real name of the computer, not the alias (refer to the table in the Computational Resources section). The __working directory__, as the name implies, is the directory that you are currently working in (see the section on `pwd` below). The prompt ends with the `$` character and finally the cursor.
One common early mistake is to assume that anytime you can see the cursor, you must be able to give the computer a command. Indeed, the system will let you type anything you want at the cursor, even if it is busy. If you do that, your command will not execute until the system finishes whatever its last command was. You can tell that the system is busy whenever you see the cursor but not the command prompt. In other words: *unless you see the full command prompt in front of the cursor*, the system is busy and is not ready to take a command.
Sometimes you may see other types of characters that look 'command prompt-ish':
* `>` or `+` on a line by itself. This usually indicates that whatever command you entered is incomplete (for example, a quote was never closed) and the system is waiting for more information. In the bash shell the continuation prompt is `>`; in R it is `+`. You can either supply the missing information, or you can return to the command prompt by hitting `CTRL-c`.
* `>` or `>>>` after starting another program. These characters often indicate that you have entered a different scripting environment, such as R (`>`) or python (`>>>`). If you find yourself at this prompt and you didn't mean to, you can return to the command prompt by typing `quit()` (R or python) or `exit()` (python).
### Directories, executables and files
Directories are for storing stuff, executables do things to stuff, and files are stuff. The equivalents on Windows or a Mac might be:
* Directories = folders
* Executables = applications/programs
* Files = files (text files, documents, etc)
On most command lines, directories will be colored to indicate that they are different from the files. And in Linux most executables are stored somewhere else, so you won't normally see those unless you are invoking them as commands.
### Root
One of the keys to working within Linux is understanding how the hierarchy of folders and directories is set up, and how to move around within that file structure. Let us take a moment to understand the Linux file system.
If you are used to interfacing with your computer via a graphical user interface such as Finder (Mac) or Windows Explorer (PC), the concept of a __root directory__ may be somewhat vague. The root directory is simply the most inclusive folder on the system, or in other words, the folder that contains all other folders and files. When working with the command line, no matter what system you use, you can designate the absolute path of a file or directory by describing its location relative to root. For Windows users, each disk drive has its own root, for example `C:`, `D:`, etc. Unix-based systems, including Mac OS X and Linux, have a single root designated simply by `/`.
### Home
__`/home`__ deserves a little special attention. __home__ is not to be confused with __root__: a home directory is merely a default location to store personal account settings and user-specific files. In Linux, your home directory is normally located at `/home/username`. Because this is a common stop for so many functions, a convenient short-hand is used to designate home: `~`.
Because your home directory belongs to you, it is one of the few places where you can explore and play around without crashing the system (usually). Because it has a convenient shortcut when using `cd`, it is also a convenient place to create links to other locations in your system. It is a very tempting place to set up shop and store all of your data. After all, when you first log in to the system, this is where you land. Even in our exercises, we will have you start by copying some example files here. Be aware though, that in our system, `/home` is not very big. __You only have 10 GB of space allotted to your `/home` directory.__ For that reason, while it is a good place to stash some small files that you want to persist on the system, you really don't want to store your actual data in `/home`. In the next class, we will discuss some better alternatives.
### Where am I? Using __`pwd`__
Your command prompt, by default, shows you the name of the directory where you are currently located. That's nice, but you can still feel lost unless you know where you are relative to root. This is how you know where you are:
```{bash, eval=FALSE}
pwd
```
```{bash, echo=FALSE}
cd ~
pwd
```
`pwd` means print working directory. It will tell you where in the file system you are by giving you the *path* starting with root. Think of the path as a road map of directories and sub-directories that tell you how to get from root to your folder or file or executable.
In Linux a full directory path will start with root: `/`, use more `/`'s in between as separators for each sub-directory and will often end with another `/`.
In this example, the output shows me that I am currently in my home directory. *Note: We said that the command prompt tells you where you are. Remember that `~` is Linux short-hand for your home directory.*
### What is in here? Using __`ls`__
The next question you might ask is what files are in my current directory? The command `ls` is used to list the contents of a directory.
```{bash, eval=FALSE}
ls
```
This lists the contents of your current directory, which if you are following along is likely your own home directory. If this is the first time you have logged in to the system, this command will probably not show you anything. That doesn't mean that it is empty however. There are probably a number of hidden files. A more complete listing of a directory's contents can be found by adding a __parameter__ or __flag__ to the `ls` command:
```{bash, eval=FALSE}
ls -a
```
From this view you can see a number of files that begin with a period: `.`. These are configuration files. As you get to be more comfortable with Linux, you may choose to edit these files in order to customize your experience. For now, we will ignore them.
By default, `ls` lists the contents of your present working directory, or where you currently are. You could however list the contents of another directory:
```{bash, eval=FALSE}
ls /data
```
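If you would like to see the difference between `ls` and `ls -a` without poking around in your real home directory, here is a small sketch. The directory is a temporary one and the file names `notes.txt` and `.settings` are invented for the example; `mktemp` and `touch` are only used to set the scene.

```bash
# Make a throwaway directory with one visible and one hidden file (made-up names).
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/.settings"

# Plain ls leaves out anything whose name starts with a dot.
visible=$(ls "$demo")

# ls -a also shows '.', '..' and the names starting with a dot.
everything=$(ls -a "$demo")

echo "ls shows:"
echo "$visible"
echo "ls -a shows:"
echo "$everything"
```

The order in which `ls -a` prints the names can differ slightly between systems, but the hidden file only ever appears in the second listing.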
### How do I go somewhere else? Using __`cd`__
`cd` means change directory. You call that command followed by a single argument to indicate which directory you want to change to. So for example, if you wanted to go to the directory called `/data`, you would enter:
```{bash}
cd /data
pwd
```
Notice that when you do this, your command prompt changes to reflect the new location you are in.
### Some special symbols
This is a good time to mention a few special symbols that act as convenient Linux shortcuts:
* `~`: We have already talked about how this is short-hand for __home__
* `.`: A single dot is short hand for __here__ meaning your present working directory
* `..`: Two dots is short-hand for __the parent directory__ or the directory above your current location.
* `/`: The forward slash designates __root__
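Here is a short sketch of those four symbols in action. It assumes nothing about your account: the directories `parent` and `child` are hypothetical and are created in a temporary location just for this example.

```bash
# Build a tiny practice tree in a temporary location (hypothetical names).
top=$(mktemp -d)
mkdir -p "$top/parent/child"
cd "$top/parent/child"

# '.' is the current directory, so this lists the (empty) child directory.
ls .

# '..' is the parent directory, so this lists what parent contains.
in_parent=$(ls ..)
echo "parent contains: $in_parent"

# '~' is your home directory and '/' is root; both work from anywhere.
tilde_path=~
echo "home is: $tilde_path"
ls / > /dev/null && echo "root can be listed from here too"
```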
### Absolute paths vs relative paths
Understanding "Where am I now?" is key to successfully operating in Linux. Very often you will need to specify the path to an input file, or the path to an output directory or destination. Let us say that you are in your home directory, and you enter the following command to list the contents of a directory called `data`:
```{bash, eval=FALSE}
cd ~
ls data
```
The path you gave to `data` is not very specific. Linux will interpret that command to be "From within the current working directory, find a directory called data, and list its contents". What about this example:
```{bash, eval=FALSE}
ls ./data
```
Remember that `.` is short for "my current location". Linux interprets this command exactly the same as the first, but now you have been slightly more explicit.
```{bash, eval=FALSE}
ls ../data
```
Remember that `..` is short for "the parent directory of my current location", or "one level up". Linux will interpret this to mean "Go up one directory level from my current location, find a directory called data, and list its contents."
Each of the previous three examples illustrates the concept of *relative paths*. In each case, how Linux interprets the command depends on where your current working directory is. If you `cd` to somewhere else, the result of that command will be different. What if you wanted to guarantee that, no matter where your current working directory was, the path would always be interpreted the same way? Consider this example:
```{bash, eval=FALSE}
ls /data
```
Recall that `/` is short-hand for "root". Linux will interpret this command to mean "Starting with root, find a directory named data and list its contents". Because root is the most encompassing directory, describing a path starting with root gives an *absolute path* or a *full path*. No matter where your current working directory is, absolute paths always refer to the same location.
It is worth noting that `~` is short for `/home/userid`, which is an absolute path reference. Therefore, any path reference that begins with `/` or `~` is an absolute path reference and will always yield the same result. All others will be interpreted relative to your current working directory.
Examples:
```{bash, eval=FALSE}
cd ~
ls .
ls ..
cd ../..
ls /
```
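The difference between relative and absolute paths is easiest to see when the same relative path gives two different answers. The following is a minimal sketch under made-up names: `projectA`, `projectB` and the files inside them are hypothetical and live in a temporary directory.

```bash
# Two sibling directories, each with its own 'data' sub-directory (made-up names).
top=$(mktemp -d)
mkdir -p "$top/projectA/data" "$top/projectB/data"
touch "$top/projectA/data/a.txt" "$top/projectB/data/b.txt"

# Relative path: the answer depends on where you are standing.
cd "$top/projectA"
from_a=$(ls data)
cd "$top/projectB"
from_b=$(ls data)

# Absolute path: the answer is the same from anywhere, even from root.
cd /
absolute=$(ls "$top/projectA/data")

echo "relative, from projectA: $from_a"
echo "relative, from projectB: $from_b"
echo "absolute, from /:        $absolute"
```

The command `ls data` was typed identically both times, yet it listed different files. Only the absolute path is immune to where you happen to be.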
***
#### <span style="color:blue">__Exercise L1:__</span>
1. Using the `cd` command, navigate to your home directory. *Hint: your home directory is designated by `~`.*
2. Use the `pwd` command to show the path to your home directory.
3. Use the `ls` command to see what files are in your home directory.
4. Use the `ls -a` command to see any hidden files in your home directory.
5. Use the `cd` command to move up a level to the parent directory of your home. *Hint: what is the short-hand for the parent directory of your location?*
6. Show the path to your current location.
7. Show the files that are in this current directory. Are there any hidden files?
8. Navigate up one more level from your current location. Show the path to this location. What is the name of this location? What files and directories are found here?
9. Navigate to the `/tools` directory. Look around at the contents of this directory. Try descending into a few of them and see what they contain.
10. Return to your home directory. Can you do it with a very simple command?
***
### How do I know how to use a command? Using __`man`__ and __`--help`__
In Linux, half the battle is knowing what commands are available to use. That comes through experience and judicious use of Google. The other half is knowing how to use them. For that, one of your first resources is the `man` command, short for *manual*. You can use the `man` command to pull up a manual about any Linux command like this:
```{bash, eval=FALSE}
man ls
```
At first glance this may not seem too helpful. Admittedly, some man pages are definitely more useful than others. Let's break down this page into its components so you can understand how to use these pages.
First, how to navigate. `man` is opened in a special Linux text reader called `less` (which we will visit in more detail later). Command-line interfaces were invented before the mouse and scroll bars, so you have to re-learn how to navigate with no mouse. Use the up and down arrows to scroll up and down line by line. When you want to exit, hit `q`.
Now that you know how to move around inside the man page, let's look at some of the sections you are likely to encounter:
* __NAME__ This section gives the name of the command and a simple statement of what it is supposed to do.
* __SYNOPSIS__ This section describes how the command is to be used and the *arguments* it expects. In computer science, an *argument* is the extra information and parameters that the user specifies to a command to modify its behavior.
    + [OPTION] This indicates that the first argument you can pass is an optional parameter or flag. The parameters and flags that you can use are listed below under the Description section.
    + `...` This indicates that you can pass more than one of something. In this case, since it follows [OPTION], you can pass more than one option.
    + [FILE] This indicates that you could specify a particular file or path to list the contents of.
    + `...` Again, this means that you could specify more than one file or path.
    + Note that anything enclosed in square brackets `[]` is designated as an optional argument. In this case, neither OPTION nor FILE needs to be specified for the command to work.
* __DESCRIPTION__ This section gives more details about what the command does and describes what options are available for you to use.
Options can be roughly divided into two categories: *flags* and *parameters*
* __flags__ are typically options that you can set to turn certain behaviors on or off. Because they are typically yes/no, they don't require any further arguments. In Linux, flags are set using the minus sign `-` and are typically followed by a letter. You can set multiple flags at once by specifying multiple letters after the `-`. There's often a long form version of flags that is set using the double minus sign `--` followed by a word. The double minus usually distinguishes between a long form flag vs a set of multiple flags.
```{bash, eval=FALSE}
ls
ls -a
ls -l
ls -al
ls --all
```
* __parameters__ are special flags that require some additional arguments. For instance, you could specify the name of an input file or search string. Parameters look like flags, but are usually followed by `=` (or sometimes a space) and a parameter value. As above, sometimes the parameter is designated with a word rather than a single letter. In this case, you would usually use a double minus `--`. (Just be aware that the double minus thing isn't a hard-and-fast rule!)
```{bash, eval=FALSE}
ls --ignore=tools /
```
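To see a flag and a parameter side by side, here is a small sketch you can run anywhere. It assumes the GNU version of `ls` that ships with Linux (the `--all` and `--ignore` long options are GNU features), and the file names are invented for the example.

```bash
# Throwaway directory with three entries (made-up names).
demo=$(mktemp -d)
touch "$demo/keep.txt" "$demo/skip.log" "$demo/.hidden"

# Flag, short form and long form: both switch on "show hidden entries".
short_form=$(ls -a "$demo")
long_form=$(ls --all "$demo")

# Parameter: --ignore needs a value, here a pattern of names to leave out.
without_logs=$(ls --ignore='*.log' "$demo")

echo "with --ignore='*.log':"
echo "$without_logs"
```

The two flag forms produce identical listings, while the parameter changes the result according to the value you hand it.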
Because Linux is open source, software gets contributed by many different programmers. Sometimes, the programmers elect not to include a `man` page. An alternative route is to try the `--help` flag:
```{bash, eval=FALSE}
ls --help
```
As you can see, this returns essentially the same information.
### Getting more details about a directory
One of the options you may have noticed for `ls` is the `-l` flag, which is the flag for "long listing". This lists the contents of a directory with slightly more information.
```{bash, eval=FALSE}
ls -l /tools/sampledata/
```
```{bash, echo=FALSE}
ls -l /tools/sampledata/ | head -n 12
```
This long list format gives some additional useful information about the contents of the directory:
1. A string of 10 characters that describe the __permissions__ for the item.
2. An integer giving the number of hard links to the item. This value is usually '1' for a file; for a directory it is typically 2 plus the number of sub-directories it contains.
3. The userid for the owner of the item. The owner is usually the person who created it.
4. The groupid for the group that the item belongs to. See 'Permissions' below for more details.
5. The size of the file in bytes.
6. The month, day and time that the item was last modified.
7. The name of the item.
You may notice that the item `ctd` is colored differently. This is because it is a directory, as indicated by the leading character `d` of the permissions string.
You may also notice that the item `badtextfile.txt` is also colored differently. This is a hint that this item may be an executable, as indicated by the presence of the `x` characters in the permission string.
### What are permissions?
Let's look at a toy example of a long listed file:
```{bash, eval=FALSE}
-rw-r--r-- 1 username rleQAS_SCRI-Sudo 8.2K Oct 7 17:21 file.txt
```
Permissions are used by Linux to control who can do things. They can be applied to (u)sers, (g)roups, (o)thers, or (a)ll. (u)ser means the user who owns (usually the one who created) the file, (g)roup means the group that the file belongs to, and (o)thers means everyone who is not in these first two categories. In Linux everything belongs to someone, so ownership and permission to act accordingly is a big deal. Each file is also assigned to a group of users. Each of these categories (user, group or other) can have separate permissions set to control whether people are allowed to (r)ead, (w)rite or e(x)ecute the file. Returning to the mysterious string of ten characters mentioned above, it is actually broken into 4 sections:

1. The very first character doesn't actually have to do with permissions. It just indicates what kind of item this is. For example, `d` means a directory and `-` means a regular file. In general, you won't need to spend a lot of time thinking about this first character.
2. The next three characters indicate the (r)ead, (w)rite and e(x)ecute policy for the (u)ser.
3. The next three characters indicate the (r)ead, (w)rite and e(x)ecute policy for the (g)roup.
4. The last three characters indicate the (r)ead, (w)rite and e(x)ecute policy for (o)thers.
Based on this, we can see that the file above is readable and writable by its owner, but only readable by members of its group and by everyone else. If you ever need to change the permissions on a file, you can do that with the `chmod` command. Note that you may not be allowed to change permissions for some files and directories, depending on who they belong to and what kind of access privileges __you__ have.
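As a minimal sketch of how `chmod` changes the permissions string, the commands below work on an empty scratch file in a temporary directory (the file name `demo.txt` is made up for illustration). The sketch assumes GNU coreutils, where `stat -c %A` prints just the permissions string; `ls -l demo.txt` shows the same string as part of the long listing.

```bash
# Work in a throwaway directory so nothing in your home directory is touched
workdir=$(mktemp -d)
cd "$workdir"

# Create an empty file (demo.txt is a hypothetical name) and give it the
# permissions from the example above: rw for (u)ser, r for (g)roup and (o)thers
touch demo.txt
chmod u=rw,g=r,o=r demo.txt
before=$(stat -c %A demo.txt)   # GNU stat: print only the permissions string
echo "$before"                  # -rw-r--r--

# Add write permission for the (g)roup and remove read permission for (o)thers
chmod g+w,o-r demo.txt
after=$(stat -c %A demo.txt)
echo "$after"                   # -rw-rw----

# The scratch directory can be deleted afterwards with: rm -r "$workdir"
```

The letters used with `chmod` are the same ones described above: `u`, `g`, `o` (or `a`) select who is affected, and `r`, `w`, `x` select which permission is added (`+`), removed (`-`) or set exactly (`=`).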
***
#### <span style="color:blue">__Exercise L1:__</span>
1. Navigate to your home directory.
2. Navigate to the parent directory for all of the home directories by going up a level.
3. List the contents of this directory. You should see sub-directories for each user on the system. Are you allowed to see the contents of other users' directories? How do you know?
4. Use the man page for `ls` to find a flag that will allow you to see the file size in a more human-readable form (KB, MB, GB, etc.).
# End of Class
***
***
# Homework
For homework, you will continue to practice navigating your way around a Linux file system using the `cd`, `ls`, and `pwd` commands. You will also learn three new commands, `mkdir`, `cp`, and `less`, and practice using the associated man pages.
#### <span style="color:blue">__Homework L1: `mkdir`__</span>
Sometimes you will need to make a new directory. For that there is the `mkdir` command.
1. Navigate to your home directory.
2. Make a new directory called "sandbox" in your home directory:
```{bash, eval=FALSE}
mkdir sandbox
```
3. We have just illustrated a very important feature of Linux: by default Linux only talks back to you if you do something wrong. If you give it a correct command that doesn't require the system to give you back any information (such as making a new directory), Linux silently obeys. To verify that you succeeded in making a directory, list the contents of your home directory.
4. Navigate into the sandbox directory that you just created.
5. Consult the man page for `mkdir`. Make a new directory inside 'sandbox' called 'myData' and include a flag in the command that tells `mkdir` to print a message when the directory is created. Use the short form of the flag.
6. Repeat \#5, making a directory inside 'sandbox' called 'yourData' using the long form of the flag to print a message when the directory is created. List the contents of sandbox to convince yourself that you have successfully created 'myData' and 'yourData'.
7. Navigate back to your home directory. From your home directory, create a new directory inside of sandbox as follows:
```{bash, eval=FALSE}
mkdir sandbox/ourData
```
This illustrates that you can create nested directories.
8. Navigate inside the newly created 'ourData'. Can you do it with a single `cd` command?
9. From within 'ourData', try to create a nested set of directories: `shared/flowData`. What was the result of this operation?
10. Consult the man page for `mkdir` and repeat \#9 using a flag that will tell `mkdir` to create any parent directories it needs. Also include the flag to tell it to print a message for each created directory.
#### <span style="color:blue">__Homework L2: `cp`__</span>
Often you will need to copy files from one location to another. For that there is the `cp` command.
1. Navigate to the following location: `/tools/sampledata/course1` and list the contents of this directory. This directory contains small example files of many kinds of file formats that you might expect to work with at some point.
2. Consult the Synopsis section of the man page for the `cp` command to see how to construct a copy command. You may find that there are several correct ways to construct the command. Remember that arguments enclosed in `[]` are optional. Copy the file `FEC00001_1.seq` into the myData directory that you created earlier. For SOURCE use the absolute path to the data file you want to transfer: `/tools/sampledata/course1/FEC00001_1.seq`. For DEST or DIRECTORY, use the absolute path to the destination location: `~/sandbox/myData`. Use the `ls` command to verify that you have transferred the file correctly.
3. You can use `cp` from anywhere so long as you specify absolute paths for your SOURCE and DEST. However, you can save yourself some typing if you first navigate yourself into either the SOURCE or DEST directories. This allows you to use relative paths, which will probably be shorter. Try the following. Remember that `.` is short-hand for "My current location":
```{bash, eval=FALSE}
## Example of using cp from within the SOURCE directory
cd /tools/sampledata/course1
cp FEC00002_1.seq ~/sandbox/myData
ls ~/sandbox/myData
## Example of using cp from within the DEST directory
cd ~/sandbox/myData
cp /tools/sampledata/course1/FEC00003_1.seq .
ls
```
4. Move back into your SOURCE directory. Transfer the following additional files into your myData directory using relative references: `Marrus_claudanielis.txt`, `structure_1ema.pdb`, `shaver_etal.csv`, and `nanodrop_abs.ndj`. Try to do it in a single command.
5. Try transferring the file `Optode run011.xls`. What was the result? Try copying again, but this time use this modified file name: `Optode\ run011.xls`. What does this tell you about using spaces in file names?
#### <span style="color:blue">__Homework L3: `less`__</span>
With Linux it is very easy to open up a text file and "take a peek" at what's inside, even if it is very big (such as a genome fasta file). The command to use is `less`. We will talk more about this in the next class, but in the meantime you can use it to take a peek at the files you have just copied. `less` is another of those programs that is designed to be navigated by keyboard rather than mouse. The man page for `less` is very long, so we have summarized some of the most common keystrokes in this table to help you move around:
Key stroke | Function |\| | Key stroke | Function |
-----------|----------|-|------------|----------|
q|Quit|\||Up or Down Arrow |Move up or down a line|
`space`|Next page|\||`/`abc|Search for text "abc"|
b|Back a page|\||n|Find next occurrence of "abc"|
\#\# g|Go to line \#\#|\||N|Find previous occurrence of "abc"|
G|Go to end|\||h|Show help for `less`|
1. A pdb file is a file format for 3D structures of proteins and nucleic acids. It contains a header section with metadata about the structure (for instance, details of how the structure was obtained and information about known secondary structures). It also contains coordinates in three-dimensional space for each atom in the structure and the primary amino acid sequence. Open `structure_1ema.pdb` and look at its contents using `less`:
```{bash, eval=FALSE}
cd ~/sandbox/myData/
less structure_1ema.pdb
```
2. Use the table to explore the contents. Try scrolling forwards and backwards through the file. Use the proper keystrokes to skip pages forward and backward. Search for the occurrence of the string "ASP" to find the coordinates of the aspartate residues in the structure.
3. To exit `less`, enter `q`.
4. Use `less` to explore the other file types you copied and note the differences in file content and structure. What happens when you try to open `Optode run011.xls`?
***
***
### Homework Solutions
#### <span style="color:blue">__Homework L1: `mkdir`__</span>
Sometimes you will need to make a new directory. For that there is the `mkdir` command.
1. Navigate to your home directory.
```{bash, eval=FALSE}
cd ~
```
2. Make a new directory called "sandbox" in your home directory:
```{bash, eval=FALSE}
mkdir sandbox
```
3. We have just illustrated a very important feature of Linux: by default Linux only talks back to you if you do something wrong. If you give it a correct command that doesn't require the system to give you back any information (such as making a new directory), Linux silently obeys. To verify that you succeeded in making a directory, list the contents of your home directory.
```{bash, eval=FALSE}
ls
```
4. Navigate into the sandbox directory that you just created.
```{bash, eval=FALSE}
cd sandbox
```
5. Consult the man page for `mkdir`. Make a new directory inside 'sandbox' called 'myData' and include a flag in the command that tells `mkdir` to print a message when the directory is created. Use the short form of the flag.
```{bash, eval=FALSE}
mkdir -v myData
```
6. Repeat \#5, making a directory inside 'sandbox' called 'yourData' using the long form of the flag to print a message when the directory is created. List the contents of sandbox to convince yourself that you have successfully created 'myData' and 'yourData'.
```{bash, eval=FALSE}
mkdir --verbose yourData
ls
```
7. Navigate back to your home directory. From your home directory, create a new directory inside of sandbox as follows:
```{bash, eval=FALSE}
cd ~
mkdir sandbox/ourData
```
This illustrates that you can create nested directories.
8. Navigate inside the newly created 'ourData'. Can you do it with a single `cd` command?
```{bash, eval=FALSE}
cd sandbox/ourData
```
9. From within 'ourData', try to create a nested set of directories: `shared/flowData`. What was the result of this operation? *This should result in an error because 'shared' does not exist yet.*
```{bash, eval=FALSE}
mkdir shared/flowData
```
10. Consult the man page for `mkdir` and repeat \#9 using a flag that will tell `mkdir` to create any parent directories it needs. Also include the flag to tell it to print a message for each created directory.
```{bash, eval=FALSE}
mkdir -pv shared/flowData
```
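The difference between steps 9 and 10 can be reproduced safely in a scratch directory. This is only a sketch, assuming GNU `mkdir`: the directory names match the homework, but everything is created under a temporary directory made by `mktemp -d`, and the variables `plain_result` and `parents_result` are illustrative names used to record whether each command succeeded.

```bash
# Work in a throwaway directory instead of ~/sandbox/ourData
workdir=$(mktemp -d)
cd "$workdir"

# Without -p, mkdir reports an error because the parent 'shared' is missing
if mkdir shared/flowData; then
  plain_result="created"
else
  plain_result="failed"
fi
echo "without -p: $plain_result"

# With -p, missing parents are created first; -v prints a line per directory
if mkdir -pv shared/flowData; then
  parents_result="created"
else
  parents_result="failed"
fi
echo "with -p: $parents_result"

# Running it again is not an error: -p accepts directories that already exist
mkdir -p shared/flowData

# The scratch directory can be deleted afterwards with: rm -r "$workdir"
```

The `if` test works because every command reports success or failure to the shell when it finishes, even when it prints nothing on success.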
#### <span style="color:blue">__Homework L2: `cp`__</span>
1. Navigate to `/tools/sampledata/course1` and list the contents there
```{bash, eval=FALSE}
cd /tools/sampledata/course1
ls
```
2. Construct a `cp` command to copy `FEC00001_1.seq` from `course1` to `myData`
```{bash, eval=FALSE}
cp /tools/sampledata/course1/FEC00001_1.seq ~/sandbox/myData
ls ~/sandbox/myData
```
3. Example of using cp from within the SOURCE directory
```{bash, eval=FALSE}
cd /tools/sampledata/course1
cp FEC00002_1.seq ~/sandbox/myData
ls ~/sandbox/myData
```
3. Example of using cp from within the DEST directory
```{bash, eval=FALSE}
cd ~/sandbox/myData
cp /tools/sampledata/course1/FEC00003_1.seq .
ls
```
4. Example of Syntax 2 from man page. (Note the `\` in the second line...this just allows me to break up a command across multiple lines. It is only necessary because of the page width restrictions of this pdf document.)
```{bash, eval=FALSE}
cd /tools/sampledata/course1
cp Marrus_claudanielis.txt structure_1ema.pdb shaver_etal.csv nanodrop_abs.ndj \
~/sandbox/myData/
```
4. Alternative answer using Syntax 3 from man page.
```{bash, eval=FALSE}
cp -t ~/sandbox/myData/ \
Marrus_claudanielis.txt structure_1ema.pdb shaver_etal.csv nanodrop_abs.ndj
```
5. This should return an error. Because the name has a space in it, the command is expecting to find two files, one named 'Optode' and another named 'run011.xls'. Because neither of those files exists, the command fails.
```{bash, eval=FALSE}
cp Optode run011.xls ~/sandbox/myData/
```
5. Using the "\" before a space tells the command that the space is part of the file name. This is annoying though, so as a general rule we can see that putting spaces in file names is a bad idea.
```{bash, eval=FALSE}
cp Optode\ run011.xls ~/sandbox/myData/
```
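Quoting is an alternative to the backslash: wrapping the whole name in single or double quotes also tells the shell that the space is part of the file name. The sketch below does not use the real course data; it creates an empty stand-in file with `touch` inside a scratch directory, and the `dest` directory and `copy*.xls` names are made up for illustration.

```bash
# Work in a throwaway directory with a made-up destination folder
workdir=$(mktemp -d)
cd "$workdir"
mkdir dest

# Create an empty stand-in file whose name contains a space
touch "Optode run011.xls"

# Three equivalent ways to refer to the same file
cp Optode\ run011.xls dest/copy1.xls     # backslash-escaped space
cp "Optode run011.xls" dest/copy2.xls    # double quotes
cp 'Optode run011.xls' dest/copy3.xls    # single quotes

# All three copies should now be listed
ls dest

# The scratch directory can be deleted afterwards with: rm -r "$workdir"
```

Quoting is easier to read than escaping when a name contains several spaces, but avoiding spaces in file names remains the simplest option.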
***