\documentclass[11pt,oneside,english]{book}
\usepackage{ifpdf}
\ifpdf
\usepackage[pdftex]{graphicx}
\pdfcompresslevel=9
\DeclareGraphicsExtensions{.png,.jpg,.pdf,.mps}
\else
\usepackage{graphicx}
\DeclareGraphicsExtensions{.ps,.eps}
\fi
\usepackage[T1]{fontenc}
\usepackage[latin1]{inputenc}
\usepackage{geometry}
\geometry{verbose,letterpaper,tmargin=1in,bmargin=1in,lmargin=1in,rmargin=1in}
\usepackage{babel}
\setcounter{secnumdepth}{3}
\setlength\parskip{\medskipamount}
\setlength\parindent{0pt}
\usepackage{url}
\usepackage{pslatex}
\usepackage[colorlinks=false]{hyperref}
\newcommand{\Erlang} % Write Erlang correctly
{{\sc Erlang}}
\newcommand{\Yaws} % Write Yaws correctly
{{\sc Yaws}}
\makeatletter
\usepackage[T1]{fontenc}
\usepackage{xspace}
%\usepackage{html}
\makeatother
\begin{document}
\title{Yaws - Yet Another Web Server}
\author{Claes Wikstrom\\
klacke@hyber.org}
\maketitle
\tableofcontents{}
\chapter{Introduction}
\begin{figure}[h]
\begin{center}
\includegraphics[scale=0.6] {yaws_head}
\end{center}
\end{figure}
\Yaws\ is an \Erlang\ web server. It's written in \Erlang\ and it uses
\Erlang\ as its embedded language similar to PHP in Apache or Java in Tomcat.
The advantages of \Erlang\ as an embedded web page language as opposed to
Java or PHP are many.
\begin{itemize}
\item{Speed - Using \Erlang\ both to implement the web server itself and
as the embedded scripting language gives excellent dynamic page generation
performance.}
\item{Beauty - Well, this is subjective.}
\item{Scalability - Due to the lightweight processes of \Erlang{}, \Yaws\
is able to handle a very large number of concurrent connections.}
\end{itemize}
\Yaws\ has a wide feature set; it supports:
\begin{itemize}
\item HTTP 1.0 and HTTP 1.1
\item Static content page delivery
\item Dynamic content generation using embedded \Erlang\ code in the
HTML pages
\item NCSA combined/XLF/ELF log format traffic logs
\item Virtual hosting with several servers on the same IP address
\item Multiple servers on multiple IP addresses
\item HTTP tracing for debugging
\item An interactive interpreter environment in the Web server for use while
developing and debugging a web site
\item RAM caching of commonly accessed pages
\item Full streaming capabilities of both upload and download of dynamically
generated pages
\item SSL
\item Support for WWW-Authenticated pages
\item Support API for cookie based sessions
\item Application Modules where virtual directory hierarchies can
be made
\item Embedded mode
\item WebSockets (RFC 6455)
\item Long polling (COMET) applications
\item Forward and reverse proxying
\end{itemize}
\section{Prerequisites}
This document requires that the reader:
\begin{itemize}
\item Is well acquainted with the \Erlang\ programming language.
\item Understands basic Web technologies.
\end{itemize}
\section{A tiny example}
We introduce \Yaws\ with the help of a tiny example.
\Yaws\ serves and delivers
static content pages like any other web server, except that \Yaws\ does this
much faster than most web servers. It's the dynamic pages
that make \Yaws\ interesting. Any page with the suffix ``.yaws'' is considered
a dynamic \Yaws\ page. A \Yaws\ page can contain embedded \Erlang\ snippets that
are executed while the page is being delivered to the WWW browser.
Example 1.1 is the HTML code for a small \Yaws\ page.
\begin{figure}[h]
\begin{verbatim}
<html>
<p> First paragraph
<erl>
out(Arg) ->
    {html, "<p>This string gets inserted into HTML document dynamically"}.
</erl>
<p> And here is some more HTML code
</html>
\end{verbatim}
\caption{Example 1.1}
\end{figure}
It illustrates the basic idea behind \Yaws{}. The HTML code, generally
stored in a file ending with a ``.yaws'' suffix, can contain
\verb+<erl>+ and \verb+</erl>+ tags. The \Erlang\ function
\verb+out/1+ defined inside these tags gets called and the output of
that function is inserted into the HTML document, dynamically.
It is possible to have several chunks of HTML code together with several
chunks of \Erlang\ code in the same \Yaws\ page.
The \verb+Arg+ argument supplied to the automatically invoked \verb+out/1+
function is an \Erlang\ record that contains various data which is interesting
when generating dynamic pages, for example the HTTP headers which were sent
from the WWW client and the actual TCP/IP socket leading to the WWW client.
This will be elaborated on thoroughly in later chapters.

The \verb+out/1+ function returned the tuple \verb+{html, String}+ and
\verb+String+ gets inserted into the HTML output. There are a number
of different return values that can be returned from the \verb+out/1+ function
in order to control the behavior and output from the \Yaws\ web server.
\chapter{Compile, Install, Config and Run}
This chapter is more of a ``Getting started'' guide than a full
description of the \Yaws\ configuration. \Yaws\ is hosted on GitHub
at \url{https://github.com/erlyaws/yaws}. This is where the source
code resides in a git repository and the latest unreleased version is
available via git through the following command:
\begin{verbatim}
$ git clone https://github.com/erlyaws/yaws
\end{verbatim}
Released versions of \Yaws\ are available at
\url{https://github.com/erlyaws/yaws/releases}.
\subsection{Compile and Install}
To compile and install a \Yaws\ release,
one of the prerequisites is a properly installed \Erlang\ system. \Yaws\
runs on \Erlang{}/OTP releases 23.0 and newer. Get \Erlang\ from
\url{http://www.erlang.org/}.

Compiling and installing is straightforward:
\begin{verbatim}
# cd /usr/local/src
# tar xfz yaws-X.XX.tar.gz
# cd yaws-X.XX
# ./configure && make
# make install
\end{verbatim}
The \verb+make+ command will compile the \Yaws\ web server with the
\verb+erlc+ compiler found by the configure script.
\begin{itemize}
\item \verb+make install+ - will install the executable called
\verb+yaws+ in \verb+/usr/local/bin/+ and a working
configuration file in \verb+/usr/local/etc/yaws.conf+
\end{itemize}
Alternatively, you can compile \Yaws\ with \verb+rebar+ as follows:
\begin{verbatim}
# rebar get-deps compile
\end{verbatim}
If you want to build with SOAP support, run the following command:
\begin{verbatim}
# YAWS_SOAP=1 rebar get-deps compile
\end{verbatim}
To create a \Yaws\ release with \verb+reltool+, execute the following
command:
\begin{verbatim}
# rebar generate
\end{verbatim}
Because it bundles Erlang/OTP and all of the application's dependencies,
the generated release found in \verb+rel/+ is standalone and has no
external requirements. A future release of \verb+rebar+ will allow you
to create a slim release that doesn't bundle Erlang/OTP. This is not yet
available.
While developing a \Yaws\ site, it's typically most convenient to do a
\verb+local+ install and run \Yaws\ as a non-privileged user using the
\verb+--prefix+ option of the \verb+configure+ script:
\begin{verbatim}
# ./configure --prefix=/path/to/yaws && make install
# /path/to/yaws/bin/yaws -i
\end{verbatim}
\subsection{Configure}
Let's take a look at the config file that gets written after a \verb+local+
install in \verb+/home/klacke/yaws/+. The file is
\verb+/home/klacke/yaws/etc/yaws/yaws.conf+:
\begin{figure}[h]
\begin{verbatim}
# first we have a set of globals
logdir = /home/klacke/yaws/var/log/yaws
ebin_dir = /home/klacke/yaws/lib/yaws/examples/ebin
include_dir = /home/klacke/yaws/lib/yaws/examples/include
...
# and then a set of servers
<server localhost>
port = 8000
listen = 127.0.0.1
docroot = /home/klacke/yaws/var/yaws/www
</server>
\end{verbatim}
\caption{Minimal Local Configuration}
\end{figure}
The configuration consists of an initial set of global
variables that are valid for all defined servers.
The only global directive we need to care about for now is the logdir.
\Yaws\ produces a number of log files. We start \Yaws\ interactively as
\begin{verbatim}
# ~/bin/yaws -i
Erlang (BEAM) emulator version 5.1.2.b2 [source]
Eshell V5.1.2.b2 (abort with ^G)
1>
=INFO REPORT==== 30-Oct-2002::01:38:22 ===
Using config file /home/klacke/yaws/etc/yaws/yaws.conf
=INFO REPORT==== 30-Oct-2002::01:38:22 ===
Listening to 127.0.0.1:8000 for servers ["localhost:8000"]
1>
\end{verbatim}
By starting \Yaws\ in interactive mode (using the command switch
\textit{-i}) we get a regular \Erlang\ prompt. This is most convenient
when developing \Yaws\ pages. For example we:
\begin{itemize}
\item{Can dynamically compile and load optional helper modules we need.}
\item{Get all the crash and error reports written directly to the
terminal.}
\end{itemize}
The configuration in Example 2.1 defined one HTTP server on address
127.0.0.1:8000 called ``localhost''. It is important to understand the
difference between the name and the address of a server. The name is
the expected value in the client HTTP \verb+Host:+ header. That is
typically the same as the fully-qualified DNS name of the server
whereas the address is the actual IP address of the server.
Since \Yaws\ supports virtual hosting with several servers on the same
IP address, this matters.
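
As a sketch of what this means in practice, the following fragment
defines two virtual servers that share one address and port and differ
only in name. The host names and docroot paths are hypothetical and
chosen purely for illustration; \Yaws\ would pick the server whose name
matches the \verb+Host:+ header of each request.
\begin{verbatim}
# two hypothetical virtual hosts on the same address and port
<server www.example.org>
        port = 8000
        listen = 127.0.0.1
        docroot = /path/to/www.example.org
</server>

<server intranet.example.org>
        port = 8000
        listen = 127.0.0.1
        docroot = /path/to/intranet.example.org
</server>
\end{verbatim}
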
Nevertheless, our server listens to \textit{127.0.0.1:8000} and
has the name ``localhost'', thus the correct URL for this server
is \verb+http://localhost:8000+.
The document root (docroot) for the server is a copy of the \verb+www+ directory
in the \Yaws\ source code distribution. This directory contains a bunch of
examples and we should be able to run all those examples now at the URL
\verb+http://localhost:8000+.
Instead of editing and adding files in the \Yaws\ \verb+www+
directory, we create yet another server on the same IP address but a
different port number --- and in particular a different document root
where we can add our own files.
\begin{verbatim}
# mkdir ~/test
# mkdir ~/test/logs
\end{verbatim}
Now change the config so it looks like this:
\begin{verbatim}
logdir = /home/klacke/test/logs
ebin_dir = /home/klacke/test
include_dir = /home/klacke/test
<server localhost>
port = 8000
listen = 127.0.0.1
docroot = /home/klacke/yaws/var/yaws/www
</server>
<server localhost>
port = 8001
listen = 127.0.0.1
docroot = /home/klacke/test
</server>
\end{verbatim}
We define two servers, one being the original default
and a new one pointing to a document root in our home directory.
We can now start to add static content in the form of HTML pages,
dynamic content in the form of \verb+.yaws+ pages or
\Erlang\ \verb+.beam+ code that can be used to generate the dynamic
content.
The load path will be set so that beam code in the directory
\char`\~\verb+/test+ will be automatically loaded when referenced.
It is best to run \Yaws\ interactively while developing the site.
To start \Yaws\ as a daemon, we give the flags:
\begin{verbatim}
# yaws -D --heart
\end{verbatim}
The \textit{-D} or \textit{--daemon} flag instructs \Yaws\ to run as
a daemon and the \textit{--heart} flag will start a heartbeat program
called heart which restarts the daemon if it should crash or if it
stops responding to a regular heartbeat. By default, heart will
restart the daemon unless it has already restarted 5 times in 60
seconds or less, in which case it considers the situation fatal and
refuses to restart the daemon again. The \textit{--heart-restart=C,T}
flag changes the default 5 restarts in 60 seconds to \textit{C}
restarts in \textit{T} seconds. For infinite restarts, set both
\textit{C} and \textit{T} to 0. This flag also enables the
\textit{--heart} flag.
Once started in daemon mode, we have very limited ways of interacting
with the daemon. It is possible to query the daemon using:
\begin{verbatim}
# yaws -S
\end{verbatim}
This command produces a simple printout of uptime and number of hits
for each configured server.
If we change the configuration, we can HUP the daemon using the
command:
\begin{verbatim}
# yaws -h
\end{verbatim}
This will force the daemon to reread the configuration file.
\chapter{Static content}
\Yaws\ acts very much like any regular web server while delivering
static pages. By default \Yaws\ will cache static content in RAM.
The caching behavior is controlled by a number of global
configuration directives. Since the RAM caching occupies memory,
it may be interesting to tweak the default values for the caching directives
or even to turn it off completely.
The following configuration directives control the caching behavior:
\begin{itemize}
\item \textit{max\_num\_cached\_files = Integer}
\Yaws\ will cache small files such as commonly
accessed GIF images in RAM. This directive sets the
maximum number of cached files. The
default value is 400.
\item\textit{max\_num\_cached\_bytes = Integer}
This directive controls the total amount of RAM
that can be used for cached files.
The default value is 1000000, 1 megabyte.
\item\textit{max\_size\_cached\_file = Integer}
This directive sets a maximum size on the files
that are RAM cached by \Yaws{}. The default value is
8000 bytes, 8 kilobytes.
\end{itemize}
Perhaps confusingly, the limits specified
in the cache directives above apply to each
server separately. Thus if we have specified \verb+max_num_cached_bytes = 1000000+
and have defined 3 servers, we may actually use $3 \times 1000000$ bytes.
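
For illustration, the fragment below writes out the documented default
values explicitly in the global part of \verb+yaws.conf+. It is only a
sketch of where one would start tuning; suitable values depend on the
site.
\begin{verbatim}
# global cache settings, shown here with their default values
max_num_cached_files = 400
max_num_cached_bytes = 1000000
max_size_cached_file = 8000
\end{verbatim}
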
\chapter{Dynamic content}
Dynamic content is what \Yaws\ is all about. Most web servers are
designed with HTTP and static content in mind whereas \Yaws\ is
designed for dynamic pages from the start. Most large sites on the
Web today make heavy use of dynamic pages.
\section{Introduction}
When the client \verb+GET+s a page that has a ``.yaws'' suffix, the
\Yaws\ server will read that page from the hard disk and divide it into
parts that consist of HTML code and \Erlang\ code. Each chunk of
\Erlang\ code will be compiled into a module. The chunk of
\Erlang\ code must contain a function \verb+out/1+. If it doesn't, the
\Yaws\ server will insert a proper error message into the generated
HTML output.
When the \Yaws\ server ships a \verb+.yaws+ page, it processes the
file chunk by chunk. If a chunk is HTML code, the
server will ship that as is, whereas if it is \Erlang\ code, the
\Yaws\ server will invoke the \verb+out/1+ function in that code and
insert the output of that \verb+out/1+ function into the stream of
HTML that is being shipped to the client.
\Yaws\ will (of course) cache the result of the compilation and the
next time a client requests the same \verb+.yaws+ page \Yaws\ will be
able to invoke the already-compiled modules directly.
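
The chunking step described above can be sketched in a few lines of
plain \Erlang{}. This is not the parser \Yaws\ actually uses; the module
name \verb+chunk_sketch+ and the function \verb+split/1+ are
hypothetical, and the sketch assumes well-formed, non-nested tags.
\begin{verbatim}
-module(chunk_sketch).
-export([split/1]).

%% Split a page (a binary) into a list of {html, Bin} and {erl, Bin}
%% chunks, in the order in which they appear in the page.
split(Page) when is_binary(Page) ->
    case binary:split(Page, <<"<erl>">>) of
        [Html] ->
            %% No more <erl> tags: the rest is plain HTML.
            html(Html);
        [Html, Rest] ->
            case binary:split(Rest, <<"</erl>">>) of
                [Code, Tail] ->
                    html(Html) ++ [{erl, Code} | split(Tail)];
                [_Unterminated] ->
                    erlang:error(missing_close_erl_tag)
            end
    end.

%% Drop empty HTML chunks so that adjacent tags produce no noise.
html(<<>>) -> [];
html(Bin)  -> [{html, Bin}].
\end{verbatim}
Calling \verb+chunk_sketch:split/1+ on a page with one \verb+<erl>+
block surrounded by HTML yields an \verb+html+ chunk, an \verb+erl+
chunk, and a trailing \verb+html+ chunk, in that order.
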
\section{EHTML}
There are two ways to make the \verb+out/1+ function generate HTML
output. The first and easiest to understand is by returning a tuple
\verb+{html, String}+ where \verb+String+ then is regular HTML data
(possibly as a deep list of strings and/or binaries) which will simply
be inserted into the output stream.
An example:
\begin{verbatim}
<html>
<h1> Example 1 </h1>
<erl>
out(A) ->
    Headers = A#arg.headers,
    {html, io_lib:format("You say that you're running ~p",
                         [Headers#headers.user_agent])}.
</erl>
</html>
\end{verbatim}
The second way to generate output is by returning a tuple
\verb+{ehtml, EHTML}+ or \verb+{exhtml, EHTML}+. The exhtml variant
generates strict XHTML code. The term \verb+EHTML+ must adhere to the
following structure:
$EHTML = [EHTML] | \{TAG, Attrs, Body\} |
\{TAG, Attrs\} | \{TAG\} |\\*
\hspace*{0.75 in} \{Module, Fun, [Args]\} | fun/0 |\\*
\hspace*{0.75 in} binary() | character()$
$TAG = atom()$
$Attrs = [\{HtmlAttribute, Value\}]$
$HtmlAttribute = atom()$
$Value = string() | binary() | atom() | integer() | float() |\\*
\hspace*{0.55 in} \{Module, Fun, [Args]\} | fun/0$
$Body = EHTML$
We give an example to show what we mean. The tuple
\begin{verbatim}
{ehtml, {table, [{bgcolor, grey}],
         [
          {tr, [],
           [
            {td, [], "1"},
            {td, [], "2"},
            {td, [], "3"}
           ]
          },
          {tr, [],
           [{td, [{colspan, "3"}], "444"}]}]}}.
\end{verbatim}
expands into the following HTML code:
\begin{verbatim}
<table bgcolor="grey">
  <tr>
    <td> 1 </td>
    <td> 2 </td>
    <td> 3 </td>
  </tr>
  <tr>
    <td colspan="3"> 444 </td>
  </tr>
</table>
\end{verbatim}
At first glance it may appear as if the HTML code is more beautiful
than the \Erlang\ tuple. That may very well be the case from a purely
aesthetic point of view. However the \Erlang\ code has the advantage
of being perfectly indented by editors that have syntax support for
\Erlang\ (read Emacs). Furthermore, the \Erlang\ code is easier to
manipulate from an \Erlang\ program.
Note that ehtml supports function calls as values. Functions can
return any legal ehtml value, including other function
values. \Yaws\ supports \verb+{M,F,[Args]}+ and \verb+fun/0+ function
value forms.
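
To make the point about manipulating EHTML from a program concrete,
here is a minimal sketch that builds a table from a list of rows and,
separately, uses a \verb+fun/0+ as a body. The module name
\verb+ehtml_sketch+ and its functions are hypothetical; only the shape
of the returned terms follows the EHTML structure given above.
\begin{verbatim}
-module(ehtml_sketch).
-export([table/1, greeting/0]).

%% Build an EHTML table from a list of rows; each row is a list of
%% strings, one per cell.
table(Rows) ->
    {ehtml, {table, [{border, "1"}], [row(Row) || Row <- Rows]}}.

row(Cells) ->
    {tr, [], [{td, [], Cell} || Cell <- Cells]}.

%% Use a fun/0 as the body of a paragraph. The fun returns a string,
%% which is itself a legal EHTML value.
greeting() ->
    {ehtml, {p, [], fun() -> "Hello from a fun" end}}.
\end{verbatim}
An \verb+out/1+ function could return the result of
\verb+ehtml_sketch:table/1+ directly, since it is already wrapped in an
\verb+ehtml+ tuple.
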
As an example of some more interesting ehtml we could have an
\verb+out/1+ function that prints some of the HTTP headers. In the
\verb+www+ directory of the \Yaws\ source code distribution we have a
file called \verb+arg.yaws+. The file demonstrates the \verb+Arg+
\verb+#arg+ record parameter which is passed to the \verb+out/1+
function.
But before we discuss that code, we describe the \verb+Arg+ record
in detail.
Here is the \verb+yaws_api.hrl+ file which is included by default
in all \Yaws\ files. The \verb+#arg{}+ record contains many fields
that are useful when processing HTTP requests dynamically. We have
access to basically all the information associated with the client
request such as:
\begin{itemize}
\item The actual socket leading back to the HTTP client
\item All the HTTP headers -- parsed into a \verb+#headers+ record
\item The HTTP request -- parsed into a \verb+#http_request+ record
\item \verb+clidata+ -- data which is \verb+POST+ed by the client
\item \verb+querydata+ -- this is the remainder of the URL following
the first occurrence of a '?' character, if any.
\item \verb+docroot+ -- the absolute path to the docroot of the
virtual server that is processing the request.
\end{itemize}
\begin{verbatim}
-record(arg, {
clisock, % the socket leading to the peer client
client_ip_port, % {ClientIp, ClientPort} tuple
headers, % headers
req, % request (possibly rewritten)
orig_req, % original request
clidata, % The client data (as a binary in POST requests)
server_path, % The normalized server path
% (pre-querystring part of URI)
querydata, % For URIs of the form ...?querydata
% equiv of cgi QUERY_STRING
appmoddata, % (deprecated - use pathinfo instead) the remainder
% of the path leading up to the query
docroot, % Physical base location of data for this request
docroot_mount, % virtual directory e.g /myapp/ that the docroot
% refers to.
fullpath, % full deep path to yaws file
cont, % Continuation for chunked multipart uploads
state, % State for use by users of the out/1 callback
pid, % pid of the yaws worker process
opaque, % useful to pass static data
appmod_prepath, % (deprecated - use prepath instead) path in front
% of: <appmod><appmoddata>
prepath, % Path prior to 'dynamic' segment of URI.
% ie http://some.host/<prepath>/<script-point>/d/e
% where <script-point> is an appmod mount point,
% or .yaws,.php,.cgi,.fcgi etc script file.
pathinfo, % Set to '/d/e' when calling c.yaws for the request
% http://some.host/a/b/c.yaws/d/e
% equiv of cgi PATH_INFO
appmod_name % name of the appmod handling a request,
% or undefined if not applicable
}).
-record(http_request, {method,
path,
version}).
-record(headers, {
connection,
accept,
host,
if_modified_since,
if_match,
if_none_match,
if_range,
if_unmodified_since,
range,
referer,
user_agent,
accept_ranges,
cookie = [],
keep_alive,
location,
content_length,
content_type,
content_encoding,
authorization,
transfer_encoding,
x_forwarded_for,
other = [] % misc other headers
}).
\end{verbatim}
There are a number of \textit{advanced} fields in the \verb+#arg+
record such as \verb+appmod+ and \verb+opaque+ that will be discussed
in later chapters.
Now, we show some code which displays the content of the \verb+Arg+
\verb+#arg+ record. The code is available in \verb+yaws/www/arg.yaws+
and after a \verb+local_install+ a request to
\url{http://localhost:8000/arg.yaws} will run the code.
\begin{verbatim}
<html>
<h2> The Arg </h2>
<p>This page displays the Arg #argument structure
supplied to the out/1 function.
<erl>
out(A) ->
    Req = A#arg.req,
    H = yaws_api:reformat_header(A#arg.headers),
    {ehtml,
     [{h4,[], "The headers passed to us were:"},
      {hr},
      {ol, [],lists:map(fun(S) -> {li,[], {p,[],S}} end,H)},
      {h4, [], "The request"},
      {ul,[],
       [{li,[], f("method: ~s", [Req#http_request.method])},
        {li,[], f("path: ~p", [Req#http_request.path])},
        {li,[], f("version: ~p", [Req#http_request.version])}]},
      {hr},
      {h4, [], "Other items"},
      {ul,[],
       [{li,[], f("clisock from: ~p", [inet:peername(A#arg.clisock)])},
        {li,[], f("docroot: ~s", [A#arg.docroot])},
        {li,[], f("fullpath: ~s", [A#arg.fullpath])}]},
      {hr},
      {h4, [], "Parsed query data"},
      {pre,[], f("~p", [yaws_api:parse_query(A)])},
      {hr},
      {h4,[], "Parsed POST data "},
      {pre,[], f("~p", [yaws_api:parse_post(A)])}]}.
</erl>
</html>
\end{verbatim}
The code utilizes four functions from the \verb+yaws_api+ module. The
\verb+yaws_api+ module is a general purpose www API module that
contains various functions that are handy while developing
\Yaws\ code. We will see many more of those functions during the
examples in the following chapters.
The functions used are:
\begin{itemize}
\item \verb+yaws_api:f/2+ --- alias for \verb+io_lib:format/2+. The
\verb+f/2+ function is automatically \verb+-included+ in all
\Yaws\ code.
\item \verb+yaws_api:reformat_header/1+ --- This function takes the
\verb+#headers+ record and unparses it, that is, reproduces regular text.
\item \verb+yaws_api:parse_query/1+ --- The topic of the next section.
\item \verb+yaws_api:parse_post/1+ --- Ditto.
\end{itemize}
\section{POSTs}
\subsection{Queries}
The user can supply data to the server in many ways. The most
common is to give the data in the actual URL.
If we invoke:
\verb+GET http://localhost:8000/arg.yaws?kalle=duck&goofy=unknown+
we pass two parameters to the \verb+arg.yaws+ page. That data is
URL-encoded by the browser and the server can retrieve the data by
looking at the remainder of the URL following the '?' character. If
we invoke the \verb+arg.yaws+ page with the above mentioned URL we get
as the result of \verb+yaws_api:parse_query/1+:
$kalle = duck$
$goofy = unknown$
In \Erlang\ terminology, the call \verb+yaws_api:parse_query(Arg)+ returns
the list:
\begin{verbatim}
[{"kalle", "duck"}, {"goofy", "unknown"}]
\end{verbatim}
Both the key and the value are strings. Hence, a web page can contain URLs with
a query and thus pass data to the web server. This scheme works with any kind of
request. It is the easiest way to pass data to the web server since no form is
required in the web page.
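
Because the parsed query is an ordinary list of key and value pairs, it
can be searched with the standard \verb+proplists+ module. The helper
below is a minimal sketch; the module name \verb+query_sketch+ and the
function \verb+lookup/3+ are hypothetical and not part of \Yaws{}.
\begin{verbatim}
-module(query_sketch).
-export([lookup/3]).

%% Look up Key in a list of {Key, Value} string pairs of the shape
%% returned by yaws_api:parse_query/1. Default is returned when the
%% key is absent.
lookup(Key, Params, Default) ->
    proplists:get_value(Key, Params, Default).
\end{verbatim}
Inside \verb+out/1+ one might then write
\verb+query_sketch:lookup("kalle", yaws_api:parse_query(A), "")+ to
fetch a single parameter with an empty string as the fallback.
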
\subsection{Forms}
In order to \verb+POST+ data a form is required. Say that we have a
page called \verb+form.yaws+ that contains the following code:
\begin{verbatim}
<html>
<form action="/post_form.yaws"
      method="post">
<p> An input field
<input name="xyz" type="text">
<input type="submit">
</form>
</html>
\end{verbatim}
This will produce a page with a simple input field and a submit button.
\begin{figure}[h]
\begin{center}
\includegraphics[scale=0.6] {a}
\end{center}
\end{figure}
If we enter something---say, ``Hello there''---in the input field and
click the submit button the client will request the page indicated in
the ``action'' attribute, namely \verb+post_form.yaws+.
If that \Yaws\ page has the following code:
\begin{verbatim}
out(A) ->
    L = yaws_api:parse_post(A),
    {html, f("~p", [L])}.
\end{verbatim}
the user will see the output:
\begin{verbatim}
[{"xyz", "Hello there"}]
\end{verbatim}
The differences between using the query part of the URL
and a form are the following:
\begin{itemize}
\item Using the query arg works with any kind of request. We parse the query
argument with the function \verb+yaws_api:parse_query(Arg)+.
\item If we use a form and \verb+POST+ the user data, the client will
transmit the user data in the body of the request. That is, the
client sends a request to get the page using the \verb+POST+ method
and it then attaches the user data---encoded---into the body of the
request.
A \verb+POST+ request can have a query part in its URL as well as user
data in the body.
\end{itemize}
\section{POSTing files}
It is possible to upload files from the client to the server by means
of \verb+POST+. We indicate this in the form by telling the browser
that we want a different encoding. Here is an example form that does
this:
\begin{verbatim}
out(A) ->
    Form =
        {form, [{enctype, "multipart/form-data"},
                {method, post},
                {action, "file_upload_form.yaws"}],
         [{input, [{type, submit}, {value, "Upload"}]},
          {input, [{type,file}, {width, "50"}, {name, foo}]}]},
    {ehtml, {html,[], [{h2,[], "A simple file upload page"},
                       Form]}}.
\end{verbatim}
As shown in the figure, the page delivers the entire HTML page with
enclosing \verb+html+ markers.
\begin{figure}[h]
\begin{center}
\includegraphics[scale=0.6] {b}
\end{center}
\end{figure}
The user gets an option to browse the local host for a file
or the user can explicitly fill in the file name in the input
field. The file browsing part is automatically taken care of by the
browser.
The action field in the form states that the client shall POST to a
page called \verb+file_upload_form.yaws+. This page will get the
contents of the file in the body of the \verb+POST+ message. To read
it, we use the \verb+yaws_multipart+ module, which provides the
following capabilities:
\begin{enumerate}
\item It reads all parameters --- files uploaded and other simple
parameters.
\item It takes a few options to help file uploads. Specifically:
\begin{enumerate}
\item \verb+{max_file_size, MaxBytes}+: if the file size in bytes
exceeds \verb+MaxBytes+, return an error
\item \verb+no_temp_file+: read the uploaded file into memory without
any temp files
\item \verb+{temp_file,FullFilePath}+: specify \verb+FullFilePath+ for
the temp file; if not given, a unique file name is generated
\item \verb+{temp_dir, TempDir}+: specify \verb+TempDir+ as the
directory to store the uploaded temp file; if this option is not
provided, then by default an OS-specific temp directory such as
\verb+/tmp+ is used
\item \verb+list+: return file data in list form; this is the default
\item \verb+binary+: return file data in binary form
\item \verb+return_error_file_path+: if an error occurs writing to a
file or if \verb+max_file_size+ is exceeded, return the file
pathname as part of the error (more on this below).
\end{enumerate}
\end{enumerate}
Note that the \verb+list+ and \verb+binary+ options affect only file
data, not filenames, headers, or other parameters associated with each
file. These are always returned as strings.
Call \verb+yaws_multipart:read_multipart_form+ from your \verb+out/1+
function and it returns a tuple with the first element set to one of
these three atoms:
\begin{itemize}
\item \verb+get_more+: more data needs to be read; return this tuple
directly to \Yaws\ from your \verb+out/1+ function and it will call
your \verb+out/1+ function again when it has read more \verb+POST+
data, at which point you must call \verb+read_multipart_form+ again
\item \verb+done+: multipart form reading is complete; a
\verb+dict+ full of parameters is returned
\item \verb+error+: an error occurred
\end{itemize}
The \verb+dict+ returned with \verb+done+ allows you to query it for
parameters by name. For file upload parameters, it returns one of the
following lists:
\begin{verbatim}
[{filename, "name of the uploaded file as entered on the form"},
{value, Contents_of_the_file_all_in_memory} | _T]
\end{verbatim}
or:
\begin{verbatim}
[{filename, "name of the uploaded file as entered on the form"},
{temp_file, "full pathname of the temp file"} | _T]
\end{verbatim}
Some multipart/form messages also include headers such as \verb+Content-Type+
and \verb+Content-Transfer-Encoding+ for different subparts of the
message. If these headers are present in any subpart of a
multipart/form message, they're also included in that subpart's
parameter list, like this:
\begin{verbatim}
[{filename, "name of the uploaded file as entered on the form"},
{value, Contents_of_the_file_all_in_memory},
{content_type, "image/png"} | _T]
\end{verbatim}
Note that for the temporary file case, it's your responsibility to
delete the file when you're done with it. To ensure that you can do
this even when errors occur, include \verb+return_error_file_path+ in
the options you pass to
\verb+yaws_multipart:read_multipart_form/2+. Should an error occur,
the returned error tuple will be of this form:
\begin{verbatim}
{error, {Reason, ParamName, FileName}}
\end{verbatim}
If you supplied the \verb+{temp_file, FullFilePath}+ option,
\verb+FileName+ in the error tuple is the same as \verb+FullFilePath+,
otherwise it's the name of a temporary file \Yaws\ generated.
If the \verb+return_error_file_path+ option is not included, the
returned error tuple will instead be:
\begin{verbatim}
{error, {Reason, ParamName}}
\end{verbatim}
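
The cleanup responsibility described above can be sketched as a small
helper. The module name \verb+upload_sketch+ and both of its functions
are hypothetical; the options are taken from the list given earlier,
and the size limit is an arbitrary value chosen for illustration.
\begin{verbatim}
-module(upload_sketch).
-export([options/0, cleanup/1]).

%% Options one might pass to yaws_multipart:read_multipart_form/2.
%% The 1 MiB limit is illustrative only.
options() ->
    [{max_file_size, 1048576}, binary, return_error_file_path].

%% Delete the temp file named in a three-element error reason.
%% Any other error shape carries no file name, so nothing is deleted.
cleanup({error, {_Reason, _ParamName, FileName}}) ->
    _ = file:delete(FileName),
    ok;
cleanup({error, _Other}) ->
    ok.
\end{verbatim}
The result of \verb+file:delete/1+ is deliberately ignored here, since
the file may never have been created when the error occurred.
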
Here's an example of calling \verb+yaws_multipart:read_multipart_form/2+:
\begin{verbatim}
-module(my_yaws_controller).
-export([out/1]).

out(Arg) ->
    Options = [no_temp_file],
    case yaws_multipart:read_multipart_form(Arg, Options) of
        {done, Params} ->
            io:format("Params : ~p~n", [Params]),
            {ok, [{filename, FileName},{value,FileContent}|_]} =
                dict:find("my_file", Params),
            AnotherParam = dict:find("another_param", Params);
            %% do something with FileName, FileContent and AnotherParam
        {error, Reason} ->
            io:format("Error reading multipart form: ~p~n", [Reason]);
        Other -> Other
    end.
\end{verbatim}
Here, \verb+my_yaws_controller+ is a user-defined module compiled as
usual with \verb+erlc+ with the resulting \verb+.beam+ file placed in