KCC USER DOCUMENTATION
<1 About KCC>
KCC is a compiler for the C language on the PDP-10. It was
originally begun by Kok Chen of Stanford University around 1981 (hence
the name "KCC"), improved by a number of people at Stanford and
Columbia (primarily David Eppstein, KRONJ), and then adopted by Ken
Harrenstien (with help from Ian Macky) of SRI International as the
starting point for a complete and supported implementation of C.
KCC implements C as described by the following references:
ANSI: Draft Proposed ANSI C (as of 7-Dec-1988)
H&S: Harbison and Steele, "C: A Reference Manual",
HS1: (1st edition) Prentice-Hall, 1984, ISBN 0-13-110008-4
HS2: (2nd edition) Prentice-Hall, 1987, ISBN 0-13-109802-0
K&R: Kernighan and Ritchie, "The C Programming Language",
KR1: (1st edition) Prentice-Hall, 1978, ISBN 0-13-110163-3
KR2: (2nd edition) Prentice-Hall, 1988, ISBN 0-13-110362-8
Currently KCC is only supported for TOPS-10 and TOPS-20, although there
is no reason it cannot be used for other PDP-10 systems or processors.
The remaining discussion for the most part assumes you are on a TOPS-20
system.
<1 Using KCC>
C source files should have the extension ".C", such as PROG.C
and SUBS.C. To build a C program, whether from one or more source
files ("modules"), there are three things that must happen:
First, all modules have to be compiled with KCC to produce .REL
files (e.g. PROG.REL and SUBS.REL);
Second, the LINK loader must be invoked to load all of
the necessary modules into an executable core image;
Third, this image must be saved on disk as an .EXE file.
Every complete C program must contain one and only one module
that defines the function "main". This function is where control begins
when the program is executed, and unless otherwise specified the .EXE
file will be named after the module that "main" appears in.
You can make a C program either by using the TOPS-20 EXEC
commands COMPILE, LOAD, and SAVE, or by invoking KCC directly. For
example, suppose "main" is defined in PROG.C, and the file SUBS.C
contains auxiliary subroutines. Then,
To make:                  EXEC command          Direct KCC invocation
-------                   ------------          ---------------------
PROG.EXE from .C files:   @LOAD PROG,SUBS       @CC -q PROG SUBS
                          @SAVE PROG
Just the .REL files:      @COMPILE PROG,SUBS    @CC -q -c PROG SUBS
PROG.EXE from .RELs:      Same as 1st           @CC PROG.REL SUBS.REL
One advantage of using the EXEC commands is that they will
only compile those files which appear to require it, i.e. modules for
which the .C file is more recent than the .REL file. The EXEC can also
translate TOPS-20 directory names into a format that the DEC loader will
understand, so that commands like @COMPILE <FOO>PROG are possible.
However, KCC will do a similar form of conditional compilation
if the -q switch is set, for those modules specified without a .C
extension. (This may become the default someday.) More commonly, the
EXEC at your site may not have been modified to know about KCC, or you
may wish to specify certain options to the compilation, or you may
just come from a UNIX background and feel more used to the direct
invocation method.
<1 Direct Invocation - Compiler switches>
The KCC compiler switches are intended to resemble those of the
UN*X "cc" command as closely as possible. If you are familiar with these,
you can probably use KCC instinctively. The command line is broken up into
argument strings each separated by a space (NOT by a comma). If an argument
string starts with a "-", it is a switch, otherwise it is a filename.
Case is significant in switches!
Normally, if a file exists for a given filename, that file is
always compiled regardless of what it contains or what the name looks
like. The exception is files with a ".REL" extension, which are never
compiled but are just passed on to the linking loader. If a filename does
not exist and appears to have no extension, ".C" is added. This feature
is primarily useful with the -q switch as it requests conditional
compilation. Case is not significant in filenames.
If none of -c, -E, or -S are given as switches, KCC will invoke
LINK after compilation and an executable file (*.EXE) will be produced.
The ordering of switches and filenames, in general, does not
matter; all switches are processed before compiling starts. However,
note that filenames and libraries will be compiled and/or loaded in
the order given, and -I paths will also be scanned in the order given.
It is possible to specify KCC switches while giving a
COMPILE-class command to the EXEC, if your EXEC recognizes the switch
/LANGUAGE-SWITCHES. The argument to this EXEC switch should be a
double-quoted string which starts with a space. For example:
@compile foo /laNGUAGE-SWITCHES:" -m -d=sym"
------------------------------------------------------------------------
The following are the available compiler switches, in alphabetical
order. They are the same as those used by UN*X "cc", except for the
ones marked with a "*", which are mainly of interest to KCC
implementors.
* -A<file> Specify a file name for the assembler header file (included
at the start of all assembler output).
-c Compile and assemble, but don't link (produce *.REL).
-C Retain comments in preprocessor (only useful with -E).
* -d Debugging output. Same as -d=all. Generates many debug files.
* -d=<fs> Debugging fine-tuning.
<fs> are flag names of particular kinds of debug output files.
The names can be abbreviated. Prefixing the name with a
'+' turns it on; '-' turns it off. All flags are initially
assumed off. Current flags are:
parse Parse tree output (*.DEB)
pho Peep-Hole Optimizer output (*.PHO) - HUGE!!!
sym Symbol table output (*.CYM)
all All of the above
E.g. "-d=parse+sym" == "-d=all-pho"
-D<ident> Define following ident to "1" or string after '='.
E.g. "-DMAXSIZE=25". Several of these may be specified.
-E Run source only through preprocessor, to standard output.
* -H<path> Specify a non-standard location for <>-enclosed #include files.
-i Loader: same as -i=extend.
* -i=<fs> Loader options.
<fs> are flags selecting particular options, as follows:
extend Load code for extended addressing (multi-section).
psect Load code using DATA and CODE PSECTs.
-I<path> Supply a search path for doublequoted #include files.
Several of these may be specified, and will be searched in
that order.
* -L<path> Loader: Specify a non-standard location for library files.
* -L=<str> Loader: Specify an arbitrary string argument to the loader.
Note that the syntax does not permit spaces to be included.
Several of these may be given.
-lnam Loader: Specify library filename for loader. The "nam"
argument is used to construct the filename LIBnam.REL in the
library directory path and this is searched when encountered
in the specifications.
* -m Use MACRO rather than FAIL. Semi-obsolete, same as -x=macro.
-O Optimize (no-op, defaults on). Same as -O=all.
* -O=<fs> Optimization fine-tuning. Mainly for debugging.
<fs> are flag names of particular kinds of optimizations.
The names can be abbreviated. Prefixing the name with a '+' turns
it on; '-' turns it off. All flags are initially assumed off,
so to ask for no optimization use -O= (same as -O=-all).
Current flags are:
parse Parse tree optimization
gen Code generator optimizations
object Object code (peephole) optimizations
all All of the above
E.g. "-O=parse+gen" == "-O=all-object"
-o=<file> Specify output filename for the executable image.
For UN*X-compatibility kicks, "-o <file>" also works.
* -P=<fs> Portability level specifications. Several switches may be given in
a format similar to that for -d and -O. The <fs> flags
specify the C implementation level that the compiler should use:
base Base level C -- most portable and restricted
carm H&S CARM level -- full implementation
ansi H&S CARM plus some ANSI -- working compromise
stdc ANSI C level (per latest Draft Proposed Standard)
Only one of the previous 4 is allowed, plus an optional:
kcc Permit KCC-specific extensions to the selected level.
The default is "stdc+kcc" if -P is not given. -P alone is
interpreted as "base".
* -q Conditional compilation. All file specs without an extension will
only be compiled if the .C file is more recent than the .REL file.
For example, "cc -q foo bar.c arf.rel"
compiles FOO.C if it is more recent than FOO.REL,
always compiles BAR.C, and never compiles ARF.
-S Don't assemble (produce *.FAI or *.MAC, plus *.PRE)
-U<ident> Undefine following identifier. All -U switches are processed
before any -D switches. See "predefined macros".
* -v Verbose - same as "-v=all".
* -v=<fs> Verbosity switches, similar to -d and -O.
fundef - print function names as they are defined (not yet).
stats - show statistics for run
load - show command string given to loader (if any)
-w Don't type out warnings.
* -x=<fs> Cross-compile switches. Several switches may be given in
a format similar to that for -d and -O. The <fs> flags
specify an aspect of the "target machine" that the
code should be compiled for (case is significant!):
Target System: tops20, tops10, waits, tenex, its
Target CPU: ka, ki, ks, kl0, klx
Target Assembler: fail, macro, midas
Target char size: ch7 (to compile with 7-bit chars)
e.g. "-x=ka+tenex". See "Cross-compiling".
------------------------------------------------------------------------
NOTE: <path> syntax
The -I, -H, and -L switches all take a "path" as argument.
This is interpreted as specifying both a prefix and a postfix string
which are used to sandwich a partial filename from some other source
(#include "xxx", #include <xxx>, and -lxxx respectively). The two
strings are separated by the character '+' (this is site dependent
however). Thus, for example:
Specification      => Prefix        Postfix        Sample with "xxx"
-I+[SYS,NEW]          ""            "[SYS,NEW]"    xxx[SYS,NEW]
-HNEWC:               "NEWC:"       ""             NEWC:xxx
-LPS:<C>LIB+.REL      "PS:<C>LIB"   ".REL"         PS:<C>LIBxxx.REL
NOTE: Obsolete features
The following switches and interpretations are obsolete. They will
likely be flushed altogether, but are documented here for historical reasons:
* -n same as -O= (no optimization)
* -s same as -d=sym (output *.CYM symbol table dump)
It used to be a feature that "simple" switches, which did not
take any arguments, could be lumped together into a single switch
string. For example, "cc -mS test" is the same as the more standard
"cc -m -S test". However, use of this feature is discouraged; the
potential confusion and inconsistency don't seem to be worth it.
NOTE: Switch Portability
The following lists the switches implemented by other systems
but not by KCC. This information seems useful and this is a convenient
place to put it. Other-system switches that KCC implements are not included.
Switches which mean one thing to KCC but another thing to other systems
are included. Currently only 4.2BSD switches are listed.
-g Output additional symtab info for dbx(1), pass -lg to ld(1)
-go Ditto for sdb(1).
-p Output profiling code for prof(1).
-pg Ditto but for gprof(1).
-R Passed on to as(1) to make initialized vars shared and read-only.
-Bpath Use substitute compiler pass programs specified by <path>.
-t[p012] Use only the pass programs from -B designated by -t.
ld(1) switches:
A, D, d, e, l, M, N, n, o, r, S, s, T, t, u, X, x, y, z
<1 User Program - Command line interpretation>
The C runtime startup interprets the command line to all
C programs in the same consistent fashion, and supports:
(1) argument string passing     @PROG arg1 arg2 arg3 ...
(2) indirect files              @PROG @fileof.args ...
(3) wild-card filenames         @PROG file.*
(4) I/O redirection             @PROG arg < infile > outfile
(5) pipes                       @PROG arg | PROG2 args
(6) background processing       @PROG arg &
(7) Argument quoting            @PROG "arg.*" "ar|g" "a>r<g"
There is also provision for suppressing the default command line
interpretation altogether.
(1) Command line arguments:
Command line arguments can be passed to the main() function
from the EXEC or monitor in the UN*X fashion. That is, main() is
given two arguments, the first of which is an argument count and
the second a pointer to an array of char pointers, each of which
constitutes an argument. It is conventional to declare the
parameters to main() in this way:
main(argc, argv)
int argc;
char **argv;
For example, if you have a C program saved as PROG.EXE, then invoking
PROG with the command:
@PROG one two
will set argc to 3, and the strings that argv points to will
be "PROG", "one", and "two". Note that arguments are separated by
blanks (whitespace) and not by commas!
(2) Indirect files:
If an argument begins with the character '@' it is interpreted
as an indirect file specification, and the rest of the argument is
assumed to be a filename; the contents of that file are parsed and
used as arguments. For example, if the file "two.txt" contained the
text "a b c" then the command:
@PROG one @two.txt three
would invoke PROG with the arguments "PROG", "one", "a", "b", "c", "three".
The format of the indirect file is the same as that for TOPS-20 indirect
files in general.
(3) Wild-card filename arguments
On TOPS-20, if any of the arguments contain the wild-card
characters '%' or '*' (without quoting -- see (7) below) then those
arguments are treated as wild-card filenames and are expanded into a
list of the filenames which match the pattern. For example,
@PROG foo.*
could expand into:
@PROG PS:<YOU>FOO.C.13 PS:<YOU>FOO.REL.5
(4) I/O redirection:
I/O redirection of stdin and stdout is also supported.
Thus:
1. @PROG <foo ; will take all stdin input from the file "foo".
2. @PROG >bar ; will send all stdout output to a new file "bar".
3. @PROG >>log ; will append all stdout output to the old file "log".
These can be combined:
@PROG <foo >bar ; does both 1 and 2. (from "foo", to "bar")
However,
@PROG <foo>bar ; interprets "<foo>bar" as a single argument string,
; because it looks like a <directory>filename.
(5) Pipes:
On TOPS-20 systems which implement the PIP: device (developed at
Stanford), pipes can also be supported, so that a command such as:
@PROG | BAZ
causes the stdout of program PROG to be redirected to the stdin of program
BAZ.
(6) Background processing:
Again, on TOPS-20 systems where the EXEC has been suitably
modified, a command line ending in an ampersand ('&') will cause the
program to be run in the background, while the user goes on to do other
things:
@PROG one two&
(7) Argument quoting:
Sometimes it is desirable to give an argument which contains
one of the special characters used by the above features. To quote
an argument string, you can surround it with either single or double
quote marks, as in "foo&bar" or 'foo&bar'. Anywhere on the line,
a backslash ('\') will quote the next character. A control-V (^V) will
also quote the next character, but will be retained in the argument
string; this is useful for filenames that have unusual characters,
because the ^V quoting must be passed along in system calls that refer
to such files.
(8) Suppressing the command line interpretation:
In certain unusual circumstances it may be necessary to suppress
the default command line interpretation, so that the user program itself
can handle it in a different way. For information on how to do this,
see the #include file <urtsud.h>.
<1 KCC Error Messages>
During normal compilation, KCC will merely announce the name of
each module as it is compiled; the assembler and linker make similar
announcements if they are invoked. However, if an error is encountered
a message will be printed in the following format:
"file.c", line 123: Undefined symbol: "bytsiz"
(getfil+6, p.3 l.45): et bytesize */ buf->st_blksize = ((512*bytsiz/
Each error message uses three lines: a Un*x-format error line, a
context line, and a blank separator line.
The first line has the same format as Un*x compiler errors. The
filename is given in quotes, followed by the line number from the start
of the file, and then the error message itself. If the error is
actually a "warning", the text "Warning - " will be at the start of the
message.
The second line is KCC-specific and is intended to provide more
context. The parenthesis-enclosed information at the beginning of the
line provides another way of locating the offending code; it gives the
name of the last function definition (plus an offset if within the
definition), and the page/line numbers within the file. Note that the
line number here is relative to the start of the indicated page, rather
than to the start of the file as for the Un*x-format line number. The
remainder of the context line is the last N characters of input seen by
KCC, with newlines converted to spaces. The very end of the line marks
the most recent character or token read by KCC. Because the compiler
does a lot of "peeking", often this is the next one after the actual
error location.
KCC always parses the entire source file regardless of how many errors
are encountered, in an attempt to report all possible problems. The
downside of this is that it is possible for a single error (such as a
missing '}' at the end of a function) to trigger a large number of
"spurious" following errors.
However, KCC stops producing assembly language output as soon as the
first error is seen, and will invoke neither the assembler nor the
linker at the end of compilation.
"Warning" messages:
Warning messages do not interfere with the rest of compilation;
they are for the user's information only. Normally it is best to fix
the source so that it no longer generates these warning messages, but if
necessary the warnings can be suppressed by using the -w switch.
"Internal error" messages:
If KCC is astonished by something extremely unpleasant, it may
generate an error message that starts with the words "Internal error - ".
These errors are due to some bug inside KCC and should be reported
as soon as possible, preferably with a sample source file that provokes
the error. KCC does attempt to continue.
"Out of memory" errors:
Under normal circumstances the regular-sized KCC has no
problems digesting source files; it can compile itself and the entire C
library, for example. But some programs may happen to be extremely
large, so large that KCC cannot hold all of the macro or symbol
definitions; in that case you will receive an "Out of memory" error
(which is always fatal). If this happens on TOPS-20, you can try using
CCX.EXE instead of CC.EXE; this is the same version of KCC, but built
with -i so that it runs with extended addressing and thus has much more
dynamically allocated memory available.
<1 C as implemented by KCC>
KCC is intended to conform to C as described in the latest
Draft Proposed ANSI Standard (7-Dec-1988), plus the
extensions described in Harbison & Steele's "C: A Reference Manual".
The -P (portability) switch controls the exact level at which
KCC attempts to compile a C program. There are four possible levels,
and only one of these may be in effect:
STDC - compiled code must conform to the ANSI C standard (or
the latest Draft Proposed standard). This is now the
default!
ANSI - The old default. Permits many ANSI constructs
to be recognized and compiled. This is basically CARM level
plus any new ANSI features that can be added without
significantly changing the language; thus, it is a
working compromise between CARM and STDC.
Users should be cautious about using ANSI features since
other compilers may not recognize them, or the features may
change before the standard becomes official.
CARM - Disables all ANSI-added features which are not in Harbison
and Steele's CARM book. KCC fully implements this level.
BASE - The most restrictive level. This disables some extensions
so that KCC will complain about some constructs
or usages that are likely to be unimplemented by some
other compilers. Good for ensuring that code is portable.
In addition, there is a "KCC extensions" flag which is independent
of the level; when enabled, this permits a number of KCC-specific extensions
to be recognized regardless of whatever level is in effect.
By default, KCC now uses the STDC level with KCC extensions enabled;
this corresponds to "-P=stdc+kcc".
The next several pages document KCC's implementation of C by
following the general ordering of H&S and pointing out aspects where
KCC differs or describing which of several optional behaviors KCC
implements. Any ANSI features which are implemented are also described.
<2 ANSI Changes>
The major visible changes in KCC due to the new proposed ANSI standard are:
Input:
Trigraphs are recognized. Beware of existing "??" sequences.
Preprocessor:
Directives can have whitespace prior to #.
Formal macro parameters are NOT recognized in string/char literals.
New operators in macro body: # and ##.
New macro recursion rules.
A function-like macro will not be invoked if the next char is not '('.
No pragmas are recognized. They are ignored with a warning.
Constants:
New char escapes \a, \?
New type suffixes (U, F)
New char/string literal prefix (L)
Type qualifiers:
const, volatile
Function prototypes:
This is one of the most important changes.
New linkage defaults (omitted-extern etc)
New initializations:
Unions can now be initialized. A brace-enclosed expression will
initialize the first member of the union.
Automatic structures, unions, and arrays can now be initialized.
Initializer lists for these must use only constant expressions, but
auto structures and unions can be initialized with a single non-constant
expression of the same type.
The mechanism for initializing large auto aggregates is still
primitive, however; it consists of creating a static object of the same
type and copying that into the auto object when the block is entered.
Bear that in mind before trying to initialize large auto arrays.
Internal linkage:
File-scope identifiers with internal linkage (static variables
and functions) are mapped into unique internally generated names; thus
normal C identifiers of up to 31 characters are always fully distinct.
New & operands:
The & (address-of) operator can now be applied to array and
function names. There is a difference between the ANSI meaning of
"&array" and the traditional meaning, however. Given "int arr[5];",
Expr       Type
"arr"      "array of 5 ints"
"arr+1"    "pointer to int"
"&arr"     Old: (if legal) "pointer to int"
           ANSI: "pointer to array of 5 ints"
<2 KCC Lexical Elements> [H&S 2, "Lexical Elements"]
KCC uses the US ASCII character set. There is provision for
using a separate target character set, different from the source set,
but currently the only such is a target set for WAITS ASCII.
KCC has no maximum line length. The context displayed in error
messages ignores line boundaries.
KCC is standard in that nested comments are not supported. If
the sequence "/*" is seen within a comment, a warning message will be
printed just in case the user neglected to terminate the previous
comment.
<2 Identifier names>
KCC adheres to the standard definition of C identifier syntax,
allowing the character "_", the letters A-Z and a-z, and the digits
0-9 as valid identifier characters. Identifiers may have any length,
but only the first 31 characters (case sensitive) are unique during
compilation, which conforms to the ANSI minimum. This applies to all
of the following name spaces (as per H&S 4.2.4):
Macro names
Statement labels
Structure, union, and enum tags
Component (member) names
Ordinary names:
Enum constants and typedef names.
Variables (see discussion of storage classes).
However, the situation is different for symbols with external
linkage, which must be exported to the PDP-10 linker. Such names are
truncated to 6 characters and case is no longer significant. The
character '_' (underscore) is transformed into '.' (period); the PDP-10
software allows the additional symbol characters '$' and '%', but there
is no way to generate these with C unless special provision is made;
see #asm and '`' under "KCC Extensions". See also the discussion of
exported symbols.
<2 Reserved Words>
KCC has a number of additional reserved words depending on
the portability level setting. When KCC extensions are allowed, as
is normally the case, the following keywords exist:
"asm" - used for assembly code inclusion.
"entry" - only in certain special circumstances.
See the discussion of libraries and entry points.
_* - there are a number of keywords which begin with
the character "_". The user should never invent
such symbols, as they are all reserved for C
implementation purposes.
When ANSI or STDC level is in effect, there are three additional
reserved words. All can be considered type modifiers:
"signed" Indicates integer type is signed.
"const" Constant object
"volatile" Volatile object
<2 Constants>
The types "int" and "long" are the same -- one PDP-10 word of
36 bits, with the high bit a sign bit. Thus, the largest positive integer
constant is 0377777777777, or 34,359,738,367.
The type "double" is represented by a PDP-10 hardware format
standard range double precision number (two words). On KA processors
the format is slightly different. The decimal range is from 1.5e-39
to 1.7e38, with eighteen digits of precision.
Character constants have type "int". Multicharacter constants
of up to 4 chars are supported; they are right-justified in the word.
Because characters are 9-bit bytes, numeric escape code values can
range from '\0' to '\777'. Hexadecimal character constants are
permitted.
String constants are stored as 9-bit byte strings, and do not
share storage. That is, two instances of the constant string "foo"
will be stored in two distinct places. String constants are put in the
"pure" segment of a program along with the machine code, but this does
not actually enforce any read-only restrictions unless the user
executes a system call to protect that region.
If the portability level is ANSI or STDC then adjacent string
constants are concatenated into a single string. Thus, "foo" "bar" is
the same as "foobar".
<2 Preprocessor directives> [H&S 3, "The C Preprocessor"]
All standard C preprocessor directives are supported as described in
Harbison and Steele, including the new ANSI directives and operators.
This page specifies how KCC behaves for situations which are
implementation dependent.
Lexical Conventions: [H&S 3.2]
The '#' that starts a preprocessor directive can now be
preceded and followed by whitespace. Formal parameter names are no
longer recognized within character and string constants in macro body
definitions. Comments are treated as whitespace and not passed on to
anything else; however, KCC will print a "Nested comment" warning if it
encounters a comment which contains "/*". This serves both to catch
slightly non-portable usage (see H&S 2.2) and to detect places where
the user may have accidentally omitted a "*/".
Defining Macros: [H&S 3.3]
When defining a macro, formal parameter names are NOT recognized
within string and character constants; use the new '#' operator to stringize
macro parameters. Any comments and whitespace in the macro body
are replaced by a single space. KCC permits an argument token list
(arguments to a macro call) to extend over multiple lines. Arguments
to a call are converted in a fashion similar to that for macro bodies
-- comments and whitespace are replaced by a single space. Newlines
within an argument list are also considered whitespace. However,
string and character constants in arguments are treated as tokens, and
their contents are not scanned for macro names.
Predefined Macros: [H&S 3.3.4]
__LINE__ expands into the current decimal line number. (BSD)
__FILE__ expands into the current source filename. (BSD)
__DATE__ expands into the date of compilation, "Mmm dd yyyy".
__TIME__ expands into the time of compilation, "hh:mm:ss".
The date/time of compilation is cleared at the start of
compilation for each source file, and is set by the first
occurrence of __DATE__ or __TIME__ within that source file.
__STDC__ expands into the ANSI standard level #, currently 1.
__COMPILER_KCC__ expands into a string literal containing KCC
version information.
All macros but the last are specified by the ANSI standard. The first
two (__LINE__ and __FILE__) also exist in BSD Un*x; the next two
(__DATE__ and __TIME__) are also described in H&S. __STDC__ is only
defined when -P=stdc is in effect.
__COMPILER_KCC__ is the only non-standard predefined macro. If it is
defined, it implies that the file <c-env.h> also exists, which contains
standard KCC environment definitions. There are no other predefined
macros.
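As a hedged sketch, code that must compile under KCC and elsewhere
might test the predefined macros as follows. The function names are
hypothetical; only the macro names and <c-env.h> come from the text
above.

```c
#ifdef __COMPILER_KCC__
#include <c-env.h>      /* implied to exist whenever the macro is defined */
#endif

/* Returns 1 when compiled by KCC, 0 by any other compiler. */
int compiled_by_kcc(void)
{
#ifdef __COMPILER_KCC__
    return 1;
#else
    return 0;
#endif
}

/* Returns the ANSI level, or 0 when __STDC__ is not defined
 * (under KCC, when -P=stdc is not in effect). */
int ansi_level(void)
{
#ifdef __STDC__
    return __STDC__;
#else
    return 0;
#endif
}

/* "Mmm dd yyyy" is 11 characters and "hh:mm:ss" is 8, so the string
 * literals occupy 12 and 9 chars including the terminating null. */
int date_size(void) { return (int) sizeof(__DATE__); }
int time_size(void) { return (int) sizeof(__TIME__); }
```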
Undefining and Redefining Macros: [H&S 3.3.5]
It is not an error to redefine an already defined macro, but a
warning message will be output unless the new macro definition is the
same as the old definition; i.e. redundant definitions are allowed.
There is no macro definition stack, i.e. definitions are not
pushed/popped by #define/#undef. Attempting to define a macro named
"defined" will cause an error, since otherwise it would conflict with
the "defined" operator.
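A small sketch of these rules, using hypothetical macro names TABSIZE
and LEVEL:

```c
#define TABSIZE 512
#define TABSIZE 512     /* same definition again: redundant, no warning */

#define LEVEL 1
#undef LEVEL            /* nothing is popped; LEVEL is simply undefined */

int tabsize(void) { return TABSIZE; }

int level_is_defined(void)
{
#ifdef LEVEL
    return 1;
#else
    return 0;
#endif
}
```

Redefining TABSIZE with a different body (say 1024) would still be
accepted, but with a warning.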
Converting Tokens to Strings: [HS2 3.3.8]
As per the ANSI standard, KCC no longer recognizes formal
parameter names within string and character constants. The '#'
operator should be used within macro bodies to convert macro arguments
into strings.
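A minimal sketch of the difference, assuming ANSI-style preprocessing;
STR and QUOTED are hypothetical macro names.

```c
#define STR(x)    #x    /* '#' converts the argument into a string */
#define QUOTED(x) "x"   /* the x inside the string is NOT replaced */

/* STR(count) is "count": 5 characters plus the terminating null. */
int stringized_size(void)  { return (int) sizeof(STR(count)); }
int stringized_first(void) { return STR(count)[0]; }

/* QUOTED(count) is just "x", whatever the argument was. */
int quoted_first(void)     { return QUOTED(count)[0]; }
```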
File Inclusion: [H&S 3.4]
Included files may be nested to 10 levels. Macro expansion
is done on the line if the filename does not start with '<' or '"'.
Filenames may contain '>' or '"' characters.
#include <filename> looks only in the standard directory.
#include "filename" looks first in DSK:,
then in the -I paths in order of specification (left to right),
then in the standard directory.
The standard directory for include files is C: on TOPS-10 and TOPS-20,
<KC> on TENEX, and [SYS,KCC] on WAITS, but this is site dependent in
any case.
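Because the line is macro-expanded when the filename does not start
with '<' or '"', an include target can be supplied by a macro. A
minimal sketch; IO_HEADER and header_seen are hypothetical names.

```c
#define IO_HEADER <stdio.h>
#include IO_HEADER      /* expanded to <stdio.h> before the file search */

/* Returns 1 if the header was really included (EOF is defined there). */
int header_seen(void)
{
#ifdef EOF
    return 1;
#else
    return 0;
#endif
}
```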
Conditional Compilation: [H&S 3.5] #if,#else,#endif,#elif,#ifdef,#ifndef
The "defined" operator is recognized only within #if and #elif
expressions. Neither #elif nor "defined" will be recognized unless
the portability level is at least "carm". Within the body of a failing
conditional, only other conditional commands are recognized; all others,
even illegal commands, are ignored.
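A sketch of these rules, assuming a portability level of at least
"carm" so that #elif and "defined" are recognized. All macro names here
are hypothetical.

```c
#define HAVE_FAST_PATH 1

#if defined(HAVE_SLOW_PATH)
#define PATH_KIND 1
#elif defined(HAVE_FAST_PATH)
#define PATH_KIND 2
#else
#define PATH_KIND 0
#endif

#if 0
#bogus this unknown command sits in a failing body and is ignored
#endif

int path_kind(void) { return PATH_KIND; }
```

Here path_kind() yields 2, since only HAVE_FAST_PATH is defined.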
Explicit Line Numbering: [H&S 3.6] #line
The information from #line will be used in KCC error messages.
Macro expansion is performed on the line. Like all other
preprocessor commands, #line is eliminated and not passed on when
using the -E switch. A line that contains only "#", with no command
name, is simply ignored without error (as per ANSI).
KCC-specific Commands:
#asm, #endasm
These two commands cause the text delimited by them to be
macro-expanded (as for -E) and converted into an "asm()" expression
for direct inclusion in the output assembly language file. This
currently only works inside functions. This feature is very likely to
change, and should only be used where absolutely necessary. Keep the
code simple, as someday KCC may want to parse it.
See "KCC Extensions" for additional details.
<2 Storage classes> [H&S 4.3 "Storage Class Specifiers"]
KCC implements the ANSI standard storage classes of auto, extern, register,
static, and typedef, with the following notes:
REGISTER declarations are currently equivalent to AUTO. KCC does not
assign variables to registers, and optimizations are performed without
using the "hint" given by REGISTER. AUTO variables are almost always
more efficient, and in any case they are easier to implement. This
may someday change.
KCC now uses the ANSI concepts of linkage and definition to deal with
the question of top-level (file scope) definitions versus references.
This is similar to, but not the same as, the "omitted-extern" solution
in H&S sec 4.8 which KCC previously implemented. See the discussion of
exported symbols farther on.
Duplicate Declarations:
As per H&S 4.2.5, KCC permits any number of external
referencing declarations, if the types are the same. An external
reference may be later followed by a defining declaration.
<2 Initializers> [H&S 4.6 "Initializers"]
KCC adheres to ANSI and H&S in all required respects. The
following notes cover points which H&S describes as implementation
dependent:
Optional braces (as per ANSI) are allowed for all non-aggregate
initializers. It is permitted to drop braces from initializer lists
under the rules described in H&S 4.6.8 (HS1 4.6.9), but KCC attempts to
perform extremely stringent checking on the "shape" of initializers,
and will complain about too many or too few braces.
FLOATING-POINT initializers may be of any arithmetic type. KCC performs
compile-time floating-point arithmetic, so initializers for static and
external variables may use any constant arithmetic expression.
POINTER initializers, as described in H&S, must evaluate to an integer or
to an address plus (or minus) an integer constant.
ARRAY initializers are now allowed for automatic arrays, for level
ANSI or STDC.
ENUMERATION initializers may use any integer (as well as enum) expression.
STRUCTURE initializers can initialize bit-fields with any integer expression.
Automatic and register structures can now be initialized.
UNIONS can now be initialized. A brace-enclosed list will initialize
the first member of the union.
For automatic structures or unions, any expression with the type
of that struct/union can also be used as an initializer.
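The initializer forms noted above can be collected in one sketch. The
type and function names are hypothetical, and the automatic array
initializer assumes level ANSI or STDC.

```c
struct point { int x; int y; };
union number { int i; double d; };

static struct point make_point(int x, int y)
{
    struct point p;

    p.x = x;
    p.y = y;
    return p;
}

int sum_initialized(void)
{
    int table[3] = { 1, 2, 3 };         /* automatic array */
    struct point a = { 4, 5 };          /* automatic structure */
    struct point b = make_point(6, 7);  /* expression of the struct type */
    union number n = { 8 };             /* initializes first member, i */
    int scalar = { 9 };                 /* optional braces, non-aggregate */

    return table[0] + table[1] + table[2]
         + a.x + a.y + b.x + b.y + n.i + scalar;
}
```

The values add up to 45.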
<2 Exported symbols> [H&S 4.8 "External Names"]
Symbols which are exported to the assembler file have special restrictions
imposed by current PDP-10 software, which only recognizes 6-character
symbols from the set A-Z, 0-9, '.', '$', and '%'. In particular, case
is not significant.
Also, there is a distinction between "local" symbols exported only to
the assembler and "global" symbols exported to both the assembler and
the linker. While there is technically no reason that any symbol has
to be given to the assembler if it is not also meant for the linker, in
practice it is convenient for debugging to have some "local" symbol
definitions available so that DDT can access them.
Here is a breakdown of export status by storage class:
typedef = Exports nothing. (Not a real storage class)
auto = Exports nothing. (Local stack variables use an internal offset)
register = Exports nothing. (Same as auto)
static = If not file scope (i.e. is within a block) then nothing exported;
an internally-generated label is used.
If file scope (within no block) then exported to assembler only.
A unique label is made, but no INTERN or ENTRY statement.
extern = May or may not be global, depending on previous declarations.
If not static (internal linkage), then it has external linkage, and is
always exported to both assembler and linker.
External DEFINITION: A label, INTERN, and ENTRY are output.
External REFERENCE: An EXTERN statement is output, but only
if the symbol is actually referenced by the code.
Tentative definitions:
External declarations seen for the first time, with or without
an explicit "extern" storage class, are assumed to be external
TENTATIVE DEFINITIONS. If no explicit reference or definition is seen
by the time the end of the file is reached, a definition with a zero
initializer is generated.
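A short sketch of how file-scope declarations fall into these cases.
The identifiers are hypothetical; the comments restate the rules above
and are not taken from actual KCC output.

```c
int counter;                /* tentative definition */
int counter;                /* repeated declaration, same type: allowed */
extern int counter;         /* referencing declaration, also allowed */
extern int never_used;      /* never referenced, so no EXTERN is output */
static int step = 2;        /* internal linkage: assembler only */

/* No explicit definition of counter appears in this file, so it ends up
 * defined with a zero initializer. */
int bump(void)
{
    counter += step;
    return counter;
}
```

The first call of bump() yields 2 and the second yields 4.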
EXTERNAL LINKAGE symbols:
A defined external linkage symbol will have its own label, plus
an INTERN statement telling the assembler that this is an externally
visible symbol, plus an ENTRY statement which allows library routine
search to find this symbol. ENTRY statements will be put into the .PRE
output file rather than the main output file, since the assembler will
need to scan them prior to anything else.
A referenced external linkage symbol causes no output unless
the symbol is actually referenced by the code, in which case an EXTERN
line will be generated in the assembler output for that file. The
reason for the reference count check is that each assembler EXTERN
constitutes a library search request which must be satisfied by a
module with the corresponding symbol declared as an ENTRY. Unless this
is only done for actual references, the many superfluous declarations
found in *.h files will tend to cause many unneeded library modules to
be loaded.
INTERNAL LINKAGE symbols:
Note that global static symbols, which are internally generated,
are passed on to the assembler even though this is not necessary for
linkage purposes. The main reason this is done is to facilitate
debugging with DDT; otherwise it could be difficult to identify static
functions when looking at the machine instructions.
For identifiers with internal linkage, KCC maps each identifier into
a unique 6-character symbol. This mapping is done as follows:
(1) '%' is prefixed to the first 5 chars. If unique, done.
(2) Vowels and '_' are removed from the identifier, and (1) repeated.
(3) If the symbol resulting from (2) is still not unique, digits are added
at the end to fill it out to 6 characters, and then incremented
until a unique symbol is obtained.
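The three steps can be sketched in C as follows. This is only an
illustration of the description above, not KCC's actual code. The
table, the function names, and several details are assumptions:
symbols are upper-cased, a full 6-character symbol gives up its last
character to make room for a digit in step (3), and characters outside
the assembler's set (such as '_' in step 1) are not given any special
treatment.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define SYMLEN  6       /* PDP-10 symbols have at most 6 characters */
#define MAXSYMS 64      /* hypothetical table size for this sketch */

static char used[MAXSYMS][SYMLEN + 1];  /* symbols already assigned */
static int nused = 0;

static int is_used(const char *sym)
{
    int i;

    for (i = 0; i < nused; i++)
        if (strcmp(used[i], sym) == 0)
            return 1;
    return 0;
}

/* Step (1): '%' followed by up to 5 characters of id, upper-cased. */
static void prefix5(const char *id, char *sym)
{
    int n = 1;

    sym[0] = '%';
    while (*id != '\0' && n < SYMLEN)
        sym[n++] = (char) toupper((unsigned char) *id++);
    sym[n] = '\0';
}

/* Step (2): copy id to out, dropping vowels and '_'. */
static void squeeze(const char *id, char *out, int outsize)
{
    int n = 0;

    for (; *id != '\0' && n + 1 < outsize; id++)
        if (strchr("aeiouAEIOU_", *id) == NULL)
            out[n++] = *id;
    out[n] = '\0';
}

/* Returns the symbol chosen for id, or NULL if none could be found. */
const char *map_static(const char *id)
{
    char sym[SYMLEN + 1];
    char squeezed[SYMLEN];      /* only the first 5 characters matter */
    int width, limit, i;

    if (nused >= MAXSYMS)
        return NULL;
    prefix5(id, sym);                           /* step (1) */
    if (is_used(sym)) {
        squeeze(id, squeezed, (int) sizeof squeezed);
        prefix5(squeezed, sym);                 /* step (2) */
    }
    if (is_used(sym)) {                         /* step (3) */
        if (strlen(sym) == SYMLEN)
            sym[SYMLEN - 1] = '\0';             /* assumption: make room */
        width = SYMLEN - (int) strlen(sym);
        for (limit = 1, i = 0; i < width; i++)
            limit *= 10;
        for (i = 0; i < limit; i++) {           /* count up in the digits */
            sprintf(sym + (SYMLEN - width), "%0*d", width, i);
            if (!is_used(sym))
                break;
        }
        if (i == limit)
            return NULL;
    }
    strcpy(used[nused], sym);
    return used[nused++];
}
```

With an empty table, mapping "function", "functor", "functional", and
"functions" in that order gives %FUNCT, %FNCTR, %FNCTN, and %FNCT0
under these assumptions.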
Note that a symbol declared static within a given source file will
never be visible from another file that you may link later with it.
For example, a function declared as
static char *function()
{
...
}
will only be visible from other functions within the same source file.
This allows several modules to have functions with the same name, as
long as no two of the functions both have external linkage. It is
STRONGLY recommended for multi-module programs that you declare as many
functions as possible to be "static".
<2 Libraries and Entries>
REL files to be converted by MAKLIB into object libraries must have
any external symbols declared with ENTRY rather than merely INTERNing
them, and this declaration must be at the start of the REL file. In
order to do this, KCC generates a *.PRE "prefix" output file in
addition to the *.FAI or *.MAC output file, and invokes the assembler
in such a way that the PRE file is assembled before the main file.
This file contains ENTRY statements and any other predeclarations that
are needed before the assembler sees the actual code. Normally the
user will never see this file, but if the -S switch is used then it
will be left around as well as the FAI/MAC file. Note that if running
the assembler manually on the FAI/MAC file, you must invoke it with
a command line like this:
	[@]FAIL                         [@]MACRO
	[*]FOO=FOO.PRE,FOO.FAI          [*]FOO=FOO.PRE,FOO.MAC
COMPATIBILITY INFO:
For compatibility, KCC will continue to recognize an "entry"
keyword for some time to come. The following describes the obsolete
syntax:
To declare an entry, use the "entry" keyword at the start of the source,
before any other declarations:
"entry" ident ["," ident ...] ";"
i.e., the keyword "entry", followed by a list of identifiers separated
by commas, followed by a semicolon. This is passed on essentially
verbatim to the assembler, and has no other effect on compilation. It
should be used at the start of any runtimes or other file intended for
a library, on all variables and functions that should be visible as
entries in the library.
Note that it should still be safe to use "entry" as a non-keyword; if
used other than at the start of the file it will be treated like any
other normal identifier.
To repeat: the "entry" statement is no longer necessary. It should not
be used in new code, and should be removed from old code.
<2 Types> [H&S 5 "Types"]
STORAGE UNITS:
A KCC storage unit (what "sizeof" returns) is a 9-bit byte, and
there are 4 of these in each 36-bit PDP-10 word, ordered left to right
from most significant to least significant.
INTEGERS:
KCC's integer types have the following sizes:
	Type    Bits    "sizeof" value
	char      9     1
	short    18     2   (PDP-10 halfword)
	int      36     4   (PDP-10 word)
	long     36     4   (PDP-10 word)
All of these types may be explicitly declared as "signed". Single
variables declared as "char" or "short" are stored right-justified into
a full word; only when packed into an array or structure are they
stored as 9-bit (or 18-bit) bytes, left to right within each word.
UNSIGNED INTEGERS:
Unsigned integers are fully implemented; any integer object
may be either "signed" or "unsigned", and both forms use exactly the
same amount of storage, with the high order bit considered the sign
bit (if the object is signed). However, because the PDP-10 has
no instructions specifically for unsigned data, some operations are
slower for unsigned ints.
Addition (+) and subtraction (-) are the same.
== and != are the same.
Left shift (<<) always uses the LSH instruction (logical shift).
Right shift (>>) uses LSH for unsigned, ASH for signed operands.
ASH is an arithmetic shift which propagates the sign bit.
<,<=,>,>= are slightly slower for unsigned operands.
Casts to floating-point are slower.
Multiply (*) is also slightly slower.
Divide (/) and remainder (%) are much slower.
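As an illustration of the shift rule, the same ">>" operator selects a
different instruction depending on the operand type. The function names
are hypothetical, and the instruction names in the comments restate the
list above.

```c
/* Unsigned operand: LSH under KCC, vacated bits are filled with zeros. */
unsigned int shift_unsigned(unsigned int u, int n)
{
    return u >> n;
}

/* Signed operand: ASH under KCC, the sign bit is propagated. */
int shift_signed(int i, int n)
{
    return i >> n;
}
```

For non-negative values both forms give the same result; they differ
only when the high-order bit is set. Other compilers may treat a right
shift of a negative signed value differently.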
CHARACTER:
The plain "char" type is "unsigned char". Sign extension is
done only if chars are explicitly declared as "signed char". Normally
a char is 9 bits, although it is possible to compile code using a
7-bit assumption (see the section on char pointer hints).
An extension to KCC provides five additional types of "char"
objects, specified as "_KCCtype_charN", where N is the number of bits
in the char and may be one of 6, 7, 8, 9, or 18. All may be signed
or unsigned; their "plain" form is unsigned. See the "KCC Extensions"
section for additional details.
FLOATING-POINT:
The "float" type is represented by one word in the PDP-10
single precision floating point format; there is one bit of sign, 8
bits of exponent, and 27 bits of mantissa.
The "double" type uses two words in the PDP-10 double
precision format; there is one bit of sign, 8 bits of exponent, and
62 bits of mantissa. (Note that on the KA-10 this is a software format
with 54 mantissa bits, rather than the more usual hardware format.)
The exponent range is approximately 1.5e-39 to 1.7e38 in both
formats; single precision has about 8 significant digits and double
precision has 18. See a PDP-10 hardware reference manual for more details.
KCC also supports the new ANSI "long double" type. Currently
this is the same as "double" but this might someday change on KL-10s to
use "G" format floating point, which has an exponent range of 2.8e-309
to 9.0e307 but only 17 significant digits.
The (double) type can represent all values of (long). That
is, conversion of a (long) to a (double) and back to (long) results in
exactly the original value.
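The digit counts quoted above follow from the mantissa widths: each
mantissa bit is worth log10(2), about 0.30103, decimal digits. A
minimal sketch of that arithmetic, using hypothetical names and integer
math only:

```c
#define LOG10_2_E5 30103L       /* log10(2) scaled by 100000 */

/* Whole decimal digits carried by a mantissa of the given width. */
int decimal_digits(int mantissa_bits)
{
    return (int) ((long) mantissa_bits * LOG10_2_E5 / 100000L);
}

/* A 36-bit long (sign plus 35 magnitude bits) fits within the 62-bit
 * double mantissa, which is why long -> double -> long is exact. */
int long_fits_in_double(void)
{
    return 35 <= 62;
}
```

decimal_digits(27) gives 8 for float and decimal_digits(62) gives 18
for double.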
POINTERS:
Pointers are always a single word, but can have two different
internal formats. Pointers to void, chars, shorts, or bit-fields are
PDP-10 byte pointers (local or one-word global); pointers to all other
objects and functions are PDP-10 global word addresses. Byte pointers
point to the byte itself rather than to the preceding byte, thus LDB
instead of ILDB is done to fetch the byte.
It is very important to ensure that functions which return byte
pointer values, typically (char *), be properly declared; likewise, any
arguments which a function expects to be a byte pointer must in fact be
byte pointers, using a cast if necessary. Operations which expect a
byte pointer will not work properly when given a word pointer, and vice
versa. See the section on "pointer hints" near the end of this file
for additional information.
The "NULL" pointer is represented internally as a zero word,
i.e. the same representation as the integer value 0, regardless of
the type of the pointer. The PDP-10 address 0 (AC 0) is zeroed and
never used by KCC, in order to help catch any use of NULL pointers.
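A sketch of the declaration discipline this calls for. The function
names are hypothetical.

```c
char *skip_blanks(char *s);     /* declared before its first use, so the
                                 * result is known to be a byte pointer */

int first_nonblank(char *s)
{
    return *skip_blanks(s);
}

char *skip_blanks(char *s)
{
    while (*s == ' ')
        s++;
    return s;
}

int first_byte_of_word(void)
{
    int word = 0;
    char *p = (char *) &word;   /* the cast converts the word pointer
                                 * into a byte pointer */

    return *p;
}
```

Without the declaration of skip_blanks() ahead of first_nonblank(), the
result would be assumed to be an int rather than a byte pointer.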
ARRAYS:
The only special thing about arrays is that arrays of chars
consist of 9-bit bytes packed 4 to a word, and arrays of shorts have
18-bit halfwords packed 2 to a word; all other objects occupy at least
one word.
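The packing rule gives a simple way to estimate array storage. A
minimal sketch, with hypothetical function names:

```c
/* Words occupied by an array of n chars (9-bit bytes, 4 per word). */
unsigned long words_for_chars(unsigned long n)
{
    return (n + 3) / 4;
}

/* Words occupied by an array of n shorts (18-bit halfwords, 2 per word). */
unsigned long words_for_shorts(unsigned long n)
{
    return (n + 1) / 2;
}
```

For example, a 10-element char array needs 3 words, as does a 5-element
short array.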
ENUMERATIONS:
KCC treats enumeration types simply as integers. In the words
of H&S 5.5 (HS1 5.6.1), KCC uses the "integer model" of enumerations,
which is what ANSI has adopted.
STRUCTURES and UNIONS:
Structures and unions are always word-aligned and occupy a
whole number of words. Unlike the case for other declarations of type
"char" or "short", adjacent "char" and "short" members in a structure
are packed together as for arrays. Structures and unions may be
assigned, passed as function parameters, and returned as function
values.
Bit-fields are implemented; the maximum size of a bit-field is
36 bits. They may be declared as "int", "signed int", or "unsigned
int"; plain "int" bit-fields are unsigned. Fields are packed left to
right, conforming to the PDP-10 byte ordering convention. Bit-fields
are not compacted with anything else; a word in a structure will never
have both bit-fields and another kind of object, not even a char or
short.
It's too bad that C does not allow pointers to bit-fields,
because the PDP-10 byte pointer instructions are perfectly suited to
this application!
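A sketch of a structure that mixes bit-fields with a char member. The
names are hypothetical, and the layout comments restate the rules
above. The fields are declared "unsigned int" explicitly so that the
code does not depend on plain "int" bit-fields being unsigned.

```c
struct status {
    unsigned int ready : 1;     /* packed left to right in one word */
    unsigned int mode  : 3;
    unsigned int count : 9;
    char tag;                   /* under KCC, never shares that word */
};

int mode_after_store(int value)
{
    struct status s;

    s.ready = 1;
    s.mode = (unsigned int) value;      /* keeps only the low 3 bits */
    s.count = 0;
    s.tag = 'x';
    return (int) s.mode;
}
```

Storing 5 reads back as 5; storing 9 reads back as 1, since only 3 bits
are kept.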
FUNCTIONS:
As per H&S. A pointer to a function is simply a word address.
For the gory details of function calls and stack usage, see the
"Internals" section.
TYPEDEFS:
As per H&S. With regard to 5.10.2 (HS1 5.11.1), KCC has no
problems with redefining typedef names in inner blocks; ANSI allows this.
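For example, the following sketch (hypothetical names) redefines a
typedef name in an inner block:

```c
typedef int unit;

int inner_typedef(void)
{
    unit outer = 3;             /* unit means int here */

    {
        typedef double unit;    /* redefined for this block only */
        unit inner = 0.5;

        return (int) (outer + inner * 2);
    }
}
```

The result is 4: the outer value 3 plus 0.5 * 2.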
<2 Type Conversions> [H&S 6 "Conversions and Representations"]
Integer conversions:
There are no representation changes when converting any
integer type to any other integer type of the same size. Sign
extension and truncation are performed when necessary to convert from
one size to another. Conversions from pointers are done as per H&S
6.2.3 (HS1 6.3.4); a pointer is treated as an unsigned int and then
converted to the destination type using the integral conversion rules.
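A sketch of the size-changing conversions, with hypothetical function
names. UCHAR_MAX comes from <limits.h>; its value on KCC would follow
from the 9-bit char and is not assumed here.

```c
#include <limits.h>

/* Widening a signed char: sign extension preserves negative values. */
int widen_signed(signed char c)
{
    return c;
}

/* Widening an unsigned char: no sign extension, value is unchanged. */
int widen_unsigned(unsigned char c)
{
    return c;
}

/* Narrowing: truncation keeps only the low-order bits of the int. */
unsigned char narrow(int i)
{
    return (unsigned char) i;
}
```

For instance, narrow(UCHAR_MAX + 2) yields 1.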
Floating-point conversions:
Casting (float) to (double) or (long double) retains the
same value. However, (double) to (long double) may lose one digit
of precision, depending on the implementation chosen for (long double).
A cast to (float) of an int may lose some precision,
although a char or short can always be fully transformed. (double)