The code of an HTML document is composed of tags, which have this form: a tag starts with an opening tag in angle brackets and ends with a closing tag in angle brackets, and in between you may optionally have what is called content. Inside the angle brackets of the opening and closing tags, you have the name of the tag itself, that is, the identifier of which tag this is. In HTML, there are about a hundred different types of tags, for example one called div, D-I-V. So in a div tag, where here we have written tag in italics, you would replace that with div, both in the opening tag and the closing tag. Now in the closing tag, you will notice that after the first angle bracket there is a slash. That is what distinguishes the closing tag from the opening tag.

And then in the opening tag, after the name of the tag itself, you may optionally have one or more attributes. Attributes are name-value pairs written like so: there is the name of the attribute, then an equals sign, and then in quotation marks the value for that attribute. Tag attributes function basically like options, like parameters for that tag. Which attributes a tag takes depends upon the type of the tag. Some tags have multiple attributes, some have none. In some cases they are optional, sometimes they are required. If you do give a tag multiple attributes, it doesn't matter what order you write them in.

So here's an example tag. HTML has a tag called a, standing for anchor, which strangely is what HTML calls a hyperlink, so this tag is creating a link. You'll notice here that the name of the tag, a, is written both in the opening tag and the closing tag. In between we have some content, which is just some text, reading "Do you like ducks?" And then in the opening tag, there's one attribute, href, with a URL for its value, reading http://en.wikipedia.org/wiki/duck.
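The slide being described is not part of this transcript, so here is a rough reconstruction of the general tag form and of the anchor example as spoken. The names `tag`, `name`, and `value` are placeholders rather than real HTML names; the link text and URL are taken from the narration.

```html
<!-- General form: replace "tag" with a real tag name such as div. -->
<!-- "name" and "value" stand in for an attribute and its value. -->
<tag name="value">content</tag>

<!-- The anchor example: one attribute (href) and plain text content. -->
<a href="http://en.wikipedia.org/wiki/duck">Do you like ducks?</a>
```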
So this tag creates a hyperlink that reads "Do you like ducks?", and when the user clicks on it, it takes them to the URL en.wikipedia.org/wiki/duck. href, by the way, just stands for hypertext reference. It's a bit of a strange name, but that's what they chose. So this is a complete example of a tag.

An important thing to keep in mind, again, though, is that what the attributes are and what they mean, and also what the content is and what it means, depends entirely upon the type of tag. This is an a tag, an anchor tag, and so what you put in the content is what you actually see as the link, highlighted in blue and underlined, that you click on. And when you click on it, it takes you to whatever is specified in the href attribute. Those are the semantics specific to the anchor tag. Other tags have totally different meanings for their content or for their attributes, and others, as I said, don't necessarily have any attributes, and some of them don't take any content.

There are some tags where you can't include content at all. And in fact, for those tags, there is a special form that is preferred, where you don't write both an opening tag and a closing tag. You just write one tag where the slash comes at the end, after the attributes. For example, the img tag, the image tag. You don't put any content in an image tag, so you should write it in this form with just a slash at the end. So here is a complete image tag. The image tag has one required attribute, a source attribute, written src, and the value of the source attribute is the URL pointing to the image file we want to display.

Now, for those tags which may include content, most of them allow you to include not just text, but also other tags. So here, for example, is an anchor tag, an a tag, basically a hyperlink, and the content here is no longer just the text "Do you like ducks?" We've also thrown in an image tag.
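As a sketch of the two forms just described, the fragment below shows a content-less image tag and an anchor whose content mixes text with an image. The file name `duck.jpg` is a made-up placeholder, since the transcript does not give the actual image URL. Note that current HTML also accepts `<img>` written without the trailing slash.

```html
<!-- An image tag takes no content, so it is written as a single tag. -->
<!-- "duck.jpg" is a hypothetical file name, not the one used in the lecture. -->
<img src="duck.jpg" />

<!-- Content can contain other tags: here, text plus an image. -->
<a href="http://en.wikipedia.org/wiki/duck">Do you like ducks? <img src="duck.jpg" /></a>
```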
So this hyperlink will consist of not just the text, but also an image. And you'll be able to click on that image, and it'll be just like clicking on the text. It'll take you to the same page.

I did mention that HTML has somewhere around 100 different tags. However, many of those tags have since been deprecated, meaning that they were originally created with some purpose in mind and people later realized that they're not terribly useful or they're just a bad idea, so people shouldn't be using them anymore. And in practice, there are only about 30 or 40 tags which are used the vast majority of the time, and the rest are used very rarely or almost not at all. Furthermore, not only are many tags either ignored or deprecated, the same is true of attributes. Many of the common tags we use have one or more attributes which were useful in the early days but have since been superseded. Mainly this is because of CSS. CSS wasn't around when HTML was first introduced, and over the years CSS has added more and more features that have made a lot of the old stuff, the old tags or attributes, redundant.

So we're certainly not going to cover the entirety of HTML. We're not going to go over the whole reference. And you should also keep in mind, if you ever do look at an HTML reference, that there's a lot of stuff there that is either going to be explicitly labeled as deprecated, or you'll just find that the best practice is to use the newer CSS features wherever possible.

In any case, tags pretty much encompass the entire syntax of HTML, though there is one more thing, and that is character entity references. A character entity reference is basically just the HTML equivalent of an escape sequence, and these simply allow you to include in the content of a tag characters which you otherwise couldn't.
So for example, the less-than and greater-than symbols are used to delimit the opening and closing tags, so we need some special way to denote them in content. The way we write these character entity references is we start with an ampersand and end with a semicolon, and the text in between specifies which character entity reference this is. So for example, if in the content of a tag you wish to write Itchy & Scratchy > Tom & Jerry, then you have to write Itchy &amp; Scratchy &gt; Tom &amp; Jerry. Those three, amp, gt, and lt for less-than, are certainly the most common character entity references you'll see, except maybe nbsp, which stands for non-breaking space and is generally used as sort of a cheat, as a crutch for including an extra space in your text.

What I haven't yet mentioned is that when you write text content in an HTML tag, the whitespace in the content gets collapsed into single spaces. So wherever you have a whitespace character other than a space, like a newline character, that just gets translated into a single space. And wherever you have multiple whitespace characters next to each other, they just get collapsed down into a single space. While this works out well most of the time, it is annoying when you genuinely want multiple spaces between words. And the cheat that lets you get around this is to insert some non-breaking space character entity references.

The thing to keep in mind, though, is that it's called a non-breaking space for a reason; it's not just a space. What non-breaking means here is that the gap between two words joined by a non-breaking space won't be used as a break point for starting the next line. Normally when you write two words separated by a space, if that's near the end of the line, then the second word might get pushed down to the next line. With a non-breaking space, that doesn't happen.
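A small sketch of the entity references and whitespace rules described above. The `p` (paragraph) tag is used here only as a convenient container; it is an assumption of this example and is not discussed in the lecture.

```html
<!-- Displays as: Itchy & Scratchy > Tom & Jerry -->
<p>Itchy &amp; Scratchy &gt; Tom &amp; Jerry</p>

<!-- The newline and the run of spaces each collapse to a single space, -->
<!-- so this displays as: one two three -->
<p>one
two      three</p>

<!-- Each &nbsp; is kept, so the gap between the words is wider, -->
<!-- and the line will not wrap between "two" and "three". -->
<p>two&nbsp;&nbsp;&nbsp;three</p>
```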
So keep that difference in mind, and also understand that most experienced web designers consider the use of non-breaking spaces to be sort of a crutch; it's never the right solution. If you find yourself relying upon non-breaking spaces, you're probably doing something the wrong way, and there's probably a better way of doing it. I wouldn't fret too much about it, though.

In any case, moving on, here is a complete example of an HTML document. A proper HTML document always consists of a single html tag, inside which are two other tags: first the head tag, for the header, and then the body tag, in which go all the tags which you actually see displayed in the page. The tags which go in the header, in contrast, are things which aren't actually displayed in the page, like, for example, the title of your HTML document. When you go to a web page, you'll notice that usually you'll see a little title on the tab, and that's where this comes from: the title tag inside the head tag inside the html tag. So if we were to view this HTML document in our web browser, the tab in the browser would read "Example Web Page".

You can also see here an HTML comment, displayed in green. HTML comments are written in angle brackets, with an exclamation mark and two hyphens immediately after the opening bracket. Then you have the text of the comment, which can be whatever. And then at the end, you have, again, two hyphens before the closing angle bracket. So just like in a programming language, everything in the comment is ignored. The other thing to watch out for here is that, just like with the multi-line comments in C syntax, in Java and JavaScript, the slash-asterisk, asterisk-slash comments, you can't nest them, remember? You can't put one inside the other, or else the parser misinterprets it. And the same thing happens in HTML. So you can't put an HTML comment inside another HTML comment.
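Since the slide itself is not in the transcript, here is a rough sketch of the document skeleton and the comment syntax. The title comes from the narration; the comment wording is invented for illustration. The comment in the body deliberately shows the nesting pitfall: it ends at the first `-->`, so the words after that point are treated as ordinary page content.

```html
<html>
  <head>
    <!-- The title appears on the browser tab, not in the page itself. -->
    <title>Example Web Page</title>
  </head>
  <body>
    <!-- Pitfall: this comment <!-- tries to nest --> so this part shows up in the page -->
  </body>
</html>
```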
If you do nest comments, you can end up with text which actually isn't commented out and gets interpreted as if it were supposed to be a tag, which will probably lead to all sorts of strange behavior. So just be careful not to put comments inside other comments.

Now, you'll notice that for easy readability, I have indented things such that the html tag is up against the margin, and then all the tags contained within it I have indented by one level. And then the tags within those, like the title tag, I have indented by another level. While this style of indentation creates very neat-looking HTML that's easy to scan and read, the problem is that the nature of HTML is you tend to end up with some tags that are deeply nested, like 10 or 12 levels in. So you would end up scrolling left and right a lot if you were to browse up and down an HTML document. So unlike in code, most practitioners don't strictly indent all of their HTML. The rule that carries the day in HTML, when it comes to indentation, is basically do what you feel like.

And also be clear that the syntax of HTML is free-form. So in fact, we could put our entire HTML document on just one line. We don't have to strictly separate the closing tags onto their own lines. Putting everything on one line, though, would of course be quite ugly and bad practice, so I wouldn't recommend that. In any case, here's the same HTML document just shoved up against the left margin.

And now what if we actually put stuff in that body so that we have a real page? Well, now in the body, we have some content. We have some text content and some tags. First, there's the text "Hello there!", with the exclamation mark, and then an anchor tag with its own content, followed by an image. So if we open this HTML document in our browser, this is what we should see, assuming, of course, that the image I link to here is still available and is still the same picture.
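The document on the slide is not reproduced in this transcript, so the following is only a guess at its shape, written flush against the left margin as described. The title and the "Hello there!" text come from the narration; the link target, the link text, and the image file name `duck.jpg` are assumptions carried over from the earlier duck example.

```html
<html>
<head>
<title>Example Web Page</title>
</head>
<body>
Hello there!
<!-- Link text and URL are assumed; the narration only says "an anchor tag". -->
<a href="http://en.wikipedia.org/wiki/duck">Do you like ducks?</a>
<!-- "duck.jpg" is a hypothetical placeholder for the lecture's image URL. -->
<img src="duck.jpg" />
</body>
</html>
```

Because whitespace collapses, the text, the link, and the image would be laid out one after another on the same line, left to right, as long as they fit.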
What we end up with is a very simple page, and notice how things are laid out here just from left to right. We'll talk more later about arranging elements on the page.

Looking now, though, at a much more complicated example, the New York Times front page: if in your web browser you hit Ctrl+U, that should bring up the source of the page. That is the HTML which your browser retrieved from the web server and then interpreted to display the page. For this example, I can't fit the entirety of the document on one screen, but just looking at the top, you'll notice it starts with the html tag. Immediately after that starts the head tag, which gets closed later down the page, as you see. And immediately after that starts the body, which is closed somewhere below that we can't see here. And after that is the closing html tag. Don't fret if you think it looks complicated. There's really nothing all that complicated going on here. There's just a lot of it.

Also note that up at the very top, before even the html tag, there's something that looks like a tag but which actually isn't. It's the doctype, written with an opening angle bracket, then an exclamation mark and the word doctype. What the doctype does is specify which version of HTML this is. A doctype is actually an idea borrowed from SGML, the older markup standard that XML also descends from. Be clear that pages don't have to have a doctype. Most browsers are usually pretty good about coping without one; they just infer what they can from which tags you have and so forth. If you do include one, though, I believe best practice now is to just include one that reads doctype html. And that's it. None of that junk that says PUBLIC and all that stuff in quotation marks. You don't need any of that.
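For reference, a minimal sketch of a document that begins with the short doctype recommended above. The rest of the document is a placeholder reusing the earlier example title. The longer style the narration alludes to looks like `<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">`, which is the HTML 4.01 Strict form, shown here only for comparison.

```html
<!DOCTYPE html>
<html>
<head>
<title>Example Web Page</title>
</head>
<body>
Hello there!
</body>
</html>
```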