Welcome to Computer Science E1. My name is David Malan, and tonight is all about the internet. But first, you'll recall that in problem set one, the challenge that was page one of the problem set perhaps gave a few of you a bit of a panic if you did not take our advice at the top of the problem set to read ahead, as you're taught in grade school to read all the questions before you tackle the first. If you did find that a bit frustrating and didn't realize until an hour later, when you turned to page two, that it was not requisite to translate all the binary, consider that two or three years ago, when we first played this little trick or this little scare tactic, it was actually just random sequences of zeros and ones on the first page, because it did not occur to me that anyone would bother taking the hour or two to translate all of these codes. And needless to say, that semester's student body was far more frustrated than any of you might have been this year. So I hope that by spoiling the extra credit in question three, you've at least got something out of it. This is, I think, the last year, though, that I will put my phone number on the problem set. I have been inundated with phone calls this week, people not knowing who they're calling apparently and just expecting some kind of prize, I think, at the other end. But indeed, the extra credit decrypted to my phone number, which is on the syllabus and on the course's website. And the hope was that that's where your answer would come from, but I didn't mind the late-night phone calls, either. So it was good to chat with many of you. So tonight is about the internet. And I'll introduce this to you by way of two distance students who will soon greet you themselves. We'll talk first about instant messaging, which is a technology that's relatively simple in concept but has been increasingly popular both in the workplace as well as in personal environments. Out of curiosity, how many of you use instant messaging regularly?
OK, so a handful of you. The most popular clients these days are AOL Instant Messenger, MSN Messenger, Yahoo Messenger. ICQ used to be quite popular for a while; it's sort of fading out, though Europeans and Asians tend to still use it more so than Americans. Google Talk is perhaps the newest of these, and we'll demonstrate that tonight. So first, let's say hello to a gentleman by the name of Brian, if we can get him on the line here. He is logged into a program called iChat, and I am logged in here on a program called AOL Instant Messenger. Everything we do is freely available on the internet. All of these accounts are free. The software is free, so let's see. By double clicking on Brian, who I've added in advance to my buddy list, let's see if we can get him on the line. Hello, Brian. To contextualize this a little bit, what I've effectively just done, having logged in as screen name CSCI-E1, which is the free account I signed up for, is send a message to Brian by double clicking his name and entering it into this window. That message behind the scenes was transmitted to AOL's server. It was then forwarded from their server to Brian's personal Macintosh, where he received my message, and now we've just received his. So fortunately, you can turn these sounds off because they tend to get rather annoying back and forth, but we are now live if you'd like to say hello. I'm going to blow this window up so you can see a bit better. I've increased the font size. I would not do that in a normal conversation since it's the equivalent of bad netiquette, as we'll discuss today. But let's see if you'd like to say hello by voice. Among the features here are not only these text transmissions but also video. Let's see if we can get him on the line. So my laptop here has a tiny little webcam, or internet camera, built into it. Brian's does too, so what I've just clicked on is the video button.
The software sent an invitation, a little pop-up window, to his software, where it asks him, would you like to talk with CSCI-E1 via webcam? He has just said yes, clearly. Hopefully we'll soon see an image from him. If I click on this My Camera tab, we can see, though not quite yet, we should be able to see myself and see how long it takes to come live. Brian, can you hear us? No? You can also see when he's actually typing. So he said, no. Let's try again. Could you try connecting to us? There's unfortunately a bit of fidgeting sometimes required with these technologies for reasons of firewalls and so forth. When Brian and I tested this this weekend, it actually worked well when he initiated the request. So I've just gotten a message. Do you want to accept this invitation from Brian? I'm going to go ahead and click Accept. Now cross my fingers that this demo actually works. Whoops. On it. Try again. So we're not behind a firewall here on campus. So hopefully this will, in fact, work better. There we go. Hello, Brian. Can you hear me? Yeah. Fantastic. Well, unfortunately, you can't quite see everyone. Hopefully you can see me. I'm going to click to my view. This is, you can see me as well, Brian? Mm. All right, it's an imperfect technology. But we can see you, and awkward as this might be for you, if you'd like to say a few words to the class, everyone's staring up at you on the big screen. Indeed. Where are you calling us from, Brian? Phoenix, Arizona. So that's pretty good. Not very high latency. And do you use instant messaging video, particularly with many people? Interesting. Well, it's working quite well. Let me ask our audience if anyone would like to ask a question of you. Yes? No? Yeah, one question. How is the weather there? How is the weather there, was the question. Nice, right now? Well, we'll let you get back to work. We're going to call someone up now on G Talk, Google Talk. But thanks very much for tuning in.
You can watch this recording later this weekend. Goodbye. All right, very nice. So now we're going to call up another student by the name of Ken. We're going to use a different instant messaging client that was recently released by Google, similarly free. It's more in its beta phase. And it's got many fewer features than AOL Instant Messenger, which frankly I think is a good thing. It's quite simple. You can use Google Talk simply by having a Gmail account. And I similarly have a little talk message here. And I'm going to say hello, Ken. And see if we can get him on the line. All right, so the motivation for Google Talk is that even though that worked rather well with Brian, although only in one direction, Google Talk, of all of the clients that personally I've experimented with, has worked incredibly well. For instance, case in point, just last night, I was talking quite late at night, but quite early for him, with a friend who is currently working in Dubai, in the United Arab Emirates. And it sounded just like a typical phone call might. And it costs zero. So it is quite an impressive technology. Let's see if we can call him up indeed. Hi, is Ken there, please? Hey, Mr. Ken, how are you? Ken, you sound brilliant here. The quality is really good. You've got the whole class here listening to you. Well, that's a fantastic thing, only slightly embarrassing. Well, you don't have video, which Brian just had for us. But would you like to say a few words to the class about yourself and where you are? Sure, I'm in Alexandria, Virginia. I'm starting a new. An umpire, yes? Well, excellent. Well, let me turn to the audience and see if they have any questions for you, Ken. I'm sorry? What did you last umpire, Ken? Yeah, the last umpire. Oh, well, it's fall ball. I don't do professional games, do anything from a lower level literally going there to go to college. I guess the last one was. Well, excellent. You just want to be real out to do it.
Well, we've got a lot on tonight's agenda. But thank you so much, Ken, for tuning in. This sounds great. All right, we'll talk to you soon. Cheers, my podcast. So as I think just the mere quality of the audio attests, Google has done a really nice job, to their credit, in getting this to work. And more so than MSN Messenger and Yahoo Messenger and AOL Instant Messenger, I've found that Google Talk thus far in my experience works the best behind firewalls in particular. A common problem for people trying to do internet telephony is that if you're sitting at home behind your home router, a.k.a. firewall, and the other person is doing the same, too often with these other clients you cannot initiate the connection, essentially because you're both hidden behind these home routers. Google Talk seems to have transcended that problem particularly well, so I would encourage you to try this out. And Gmail is still officially in beta form. Gmail is just a freely available email account, like Hotmail or Yahoo Mail. It's by invitation only. So you have to ask someone who has a current Gmail account if they can send you a free invitation. The E1 account now has 20 or 100. So if you have no friends with Gmail but would like one to try this out, just drop us a note and we'll send you the invitation from our account. So it's pretty wild. More interesting than just the demonstration of it, I think, is a discussion of how technologies like this work. And so what we're going to do tonight is focus more on the application layer of the internet: what you can do with it, how it works, how you would generally go about setting up some of these services like websites. Next week, we'll take the hood off of the internet, just as we did with hardware, and look underneath it all at how the data has actually been getting from us to Brian and back, and to Ken as well, talking about things like routers and TCP/IP and firewalls and the like.
But first, in appreciation of the effort you put into your problem set one, I think rather than the Dum Dum pops that we started the course off with, you at least now qualify for some Nerds candy instead. So I'm going to pass this around. Feel free to engage in a bit of a treat as we proceed. And with that said, why don't we have a chat about the internet. There seems to be a particular interest this semester in all things security related. So rather than start with email in the form of the To field and the CC field and the Subject field, which might be a bit mundane, I thought we'd start with some of the emails I've been collecting over the past few months for the purposes of this course, specifically so that we can look at email through the lens of bad guys, looking at things like spam and at phishing attacks, such as the ones you're about to see here. One of the earliest emails I received back in, well, this was copied over, let's see, just a few days ago, actually, this PayPal email. What I've done is I've connected to my email account, the course's account, and I have pulled this up with a program called Outlook, which many of you probably use. A first interesting question to motivate our security discussions ultimately would be this one: why is there a broken link at the top of this email in the form of that red X? Those of you who use Outlook might often see such broken image icons in an email, but why is it broken? Yeah. Could be related to some unsupported ActiveX control, which is a scripting technology that we'll come back to. Another suggestion? It turns out that this is actually a good thing, that Outlook is blocking my ability to actually see the PayPal logo that is embedded into this email. The curious thing about emails over the past few years is that no longer are they simple text messages, like they were predominantly a few years ago.
Rather, they're pretty much web pages, single web pages sent to you in the form of an email, and today's email programs like Outlook or Eudora or Netscape Mail have the ability to actually display emails as though you had pulled up a web page with a browser. Well, the reason that Outlook is blocking our ability to see this image is that the embedding of images is a common technique that spammers use, that companies use, that phishers use (we'll come back to these terms) to determine if you in fact exist and if you in fact got this particular email. It turns out the image was not sent with the email. Rather, embedded in this email is what's called a hyperreference to the image. The image exists somewhere on a web server, presumably PayPal's web server. In this email, for efficiency reasons, there's just a little pointer to that image that essentially tells the email program, when the user wants to view this message, go download that image and then display it to the user, thereby making it faster for me to download the email, because I only download the image on demand, and they're not sending out millions of copies of the same image if it doesn't change in the first place. But what happens the moment I go around PayPal's, or rather Outlook's, security feature and click, as I'm doing now, Download Pictures? Well, the email now looks intact, but what has just happened from a security perspective? Exactly, my computer now, by my having chosen to download images, has done just that. It's downloaded the image from the server. Now, suppose that image has a more interesting name than PayPal.gif or PayPal.jpg, and rather has a file name of something like malan@eecs.harvard.edu.gif. Well, what you've essentially just informed the web server is that that guy that you sent that copy of the link to does exist, because he just checked his mail, and this is how spammers tend to operate.
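To make the image trick concrete, here is a minimal Python sketch, not anything Outlook itself does, that scans the HTML body of a message for images hosted on a remote server and pulls out whatever identifier is encoded in the file name. The sample body, the host `images.example.com`, and the address `jharvard@example.edu` are made up for illustration.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urlparse


class RemoteImageFinder(HTMLParser):
    """Collects the src of every <img> that points at a remote web server."""

    def __init__(self):
        super().__init__()
        self.remote_images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # Only http(s) images require a request to someone else's server.
        if urlparse(src).scheme in ("http", "https"):
            self.remote_images.append(src)


def find_remote_images(html_body):
    """Return the remote image URLs that a mail client would have to fetch."""
    finder = RemoteImageFinder()
    finder.feed(html_body)
    return finder.remote_images


def recipient_hint(image_url):
    """Return the file name minus its extension, which may identify the reader."""
    file_name = basename(urlparse(image_url).path)
    return file_name.rsplit(".", 1)[0]


# Hypothetical message body; the host and the address are invented.
SAMPLE_BODY = (
    "<p>Dear member,</p>"
    '<img src="http://images.example.com/logos/jharvard@example.edu.gif">'
)

for url in find_remote_images(SAMPLE_BODY):
    print(url, "->", recipient_hint(url))
```

A mail program that refuses to fetch anything this function returns until the user asks is doing, in spirit, what the blocked red X does.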
They will spam randomly chosen addresses. This is how a lot of you with Hotmail accounts or even Gmail accounts sometimes get spam even if you have never told anyone other than your mother what your email address is. It's that they've randomly generated the address and just by chance, M-A-L-A-N happens to be a real username. And so it got into my inbox. Unfortunately, I have just now informed this spammer or PayPal, yes I exist, so next time why don't you send me your spam in more volume? Because I do exist. You needn't randomly choose me. Now, the more interesting question is, is this in fact from PayPal? Well, it says the following. Looks pretty legit, we've got PayPal's real logo. Dear PayPal member, it has come to our attention that your PayPal billing information records are out of date. Dot, dot, dot. All of it looks pretty official. In fact, it looks quite like PayPal's own website, and it's asking me clearly to click here to activate your account. All right, seemingly innocuous. It appears to have come from who? Well, service@paypal.com. That clearly looks legitimate. But notice what happens, at least with Outlook, if I just hover over this link. Well, you see in this little yellow pop-up the true destination that I will end up at if I click this link. Now, unfortunately, even that is not glaringly worrisome because it appears to be taking me to HTTP colon slash slash some number slash US slash account verification. Still looks pretty legit, if cryptic, if over one's head, but unfortunately that address, that number, does not belong to who? PayPal. In fact, if we click on this thing, thereby giving me more spam tomorrow, we have just arrived at what appears to be PayPal's website. In fact, it's almost a perfect copy of it. But notice the URL. It doesn't say paypal.com. It could be PayPal's address, but the fact that they didn't use www.paypal.com suggests, or should suggest to you after tonight, not PayPal.
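The check being done by eye here, whether the address in the link really belongs to the company named in the email, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and `192.0.2.10` is a reserved documentation address standing in for "some number".

```python
import ipaddress
from urllib.parse import urlparse


def host_of(url):
    """Return the lower-cased host portion of a URL, or '' if there is none."""
    return (urlparse(url).hostname or "").lower()


def is_numeric_host(url):
    """True if the link points at a bare IP address instead of a name."""
    try:
        ipaddress.ip_address(host_of(url))
        return True
    except ValueError:
        return False


def belongs_to(url, expected_domain):
    """True only if the host is the expected domain or a subdomain of it."""
    host = host_of(url)
    expected = expected_domain.lower()
    return host == expected or host.endswith("." + expected)


# Made-up links of the kind described above.
for link in (
    "http://192.0.2.10/us/account-verification",
    "https://www.paypal.com/",
    "http://paypal.com.example.net/login",
):
    print(link, is_numeric_host(link), belongs_to(link, "paypal.com"))
```

Note the third link: a name that merely begins with paypal.com still fails the check, because what matters is how the host ends, not how it starts.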
This is some random guy on the internet whose computer apparently hasn't even been shut down yet by the feds or by whatever country it exists in, because this website is still operational. What these folks have done is made a perfect copy of PayPal's site. If I now type in my email address and then my password and hit Submit, that information's not going to PayPal. It's going to whatever trickster is running this website. With your username and password, guess what he can then do? He can then go to the real paypal.com, log in as you with your email address and your password, and then spend whatever funds happen to be in your PayPal account. For those unfamiliar, PayPal is sort of like an online bank account that allows you to pay for things via credit card or via cash that's sitting in your account. So this is equivalent to giving someone the PIN to your ATM card, but in the virtual world. So what might the lesson be from a demonstration such as this? For you, the home user. All right, go to the official web page always if you want to pull up such an account. In fact, many companies these days are finally getting smart about not including links in their emails encouraging customers to follow them, because that renders their customers susceptible to exactly this. This is called a phishing attack, where essentially someone is trying to hook you by sending something that looks legitimate. They want you to click and respond, but they're really just trying to lure you in under false pretenses. Now a lot of companies are still rather foolish about this, like Citibank, for instance. A real bank does tend to use, at least as of recently, real links in their emails. Why? Well, this is useful.
It means I can click on, for instance, an email informing me that my latest statement has posted, and then I can very quickly log in, and I don't need to open up Internet Explorer and type in HTTP colon slash slash www.citibank.com. But companies have been teaching customers to do this, which is bad, because now you have people trying to prey on that same training and attack people in the same way. So to clarify the lesson: if you do get such an email, that's fine, maybe it is legitimate, but pull up the website yourself. Manually type in paypal.com and you're much more likely to actually then be safe. In the email? Well, so this one too. As you'll see when we start making web pages, what you show the user and where the link goes are two different things, and it's merely a deceptive coincidence that the aesthetics look like a URL of a website, but really it's going to lead you elsewhere. But you have to beware too, because a smarter phisher might do what? Well, they might embed some actual links in this email. For instance, had they been smarter, they might have let this link go to the real paypal.com, having only the yellow link go to the bogus site, lending it all the more credibility, thereby sort of convincing the user, yeah, this is legitimate. But consider the cost of sending these emails out. As you've probably read, sending spam costs pennies to email millions of people, effectively. I mean, the marginal cost of spamming one more person is zero, which means if you send out a million of these emails, well, it's fine if most people delete them, even if a lot of the addresses are bogus, because consider if just 1% of those emails reach real users that actually click on them. 1% of one million is what? 10,000 people's accounts you have just accessed, and that's pretty good for a day's work that didn't cost you anything. So let's take a look at another.
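The point that what a link shows and where it goes are two different things can be checked mechanically. The following is a rough Python sketch, with hypothetical names and a made-up message body, that flags any link whose visible text looks like a web address but whose real destination is a different host. It is a heuristic for illustration, not a complete phishing detector.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects (href, visible text) for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(text):
    """Treat text as a URL (adding a scheme if missing) and return its host."""
    candidate = text if "://" in text else "http://" + text
    return (urlparse(candidate).hostname or "").lower()


def deceptive_links(html_body):
    """Return (shown text, real destination) pairs whose hosts disagree."""
    collector = LinkCollector()
    collector.feed(html_body)
    suspicious = []
    for href, text in collector.links:
        looks_like_url = "." in text and " " not in text
        if looks_like_url and host_of(text) != host_of(href):
            suspicious.append((text, href))
    return suspicious


# Made-up body: the first link shows one address but leads to another.
SAMPLE_BODY = (
    '<a href="http://192.0.2.10/us/verify">www.paypal.com</a> '
    '<a href="https://www.paypal.com/help">Help</a>'
)
print(deceptive_links(SAMPLE_BODY))
```

Links whose text is ordinary words, like "click here", are skipped, since there is no shown address to compare against; hovering over them is still the only way to see where they go.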
The next email that I received somewhat recently looked like, let's see, well, let's take a step back, make it a little less scary. So how about these emails that you might get, maybe a nice cute note even written by your friend, and then something, dot dot dot, send this to everybody you know, send this to everybody in your address book, maybe even disclose that you're a doofus and put your name in the email and then forward it along. I mean, what are these the equivalent of? These are 21st century chain letters. They just don't happen to cost 21 cents in stamps anymore, but it's the exact same thing, and if there's one thing we teach you tonight, please no longer be one of the people that takes these emails at face value and forwards them along, even if it's the dying wish of a young cancer patient in a hospital who just wants to see this email go around the whole world and get back to him. I mean to make fun of things like that because it's exactly the same kinds of chain letters that have been going around for years. When you get such an email, it's nothing official, it's nothing organized, it's just a bunch of people, frankly, with a little too much time. With that said, at the risk of having just offended half of you who like your internet forwards, that's all they are. All right, more security conscious perhaps. What about this one? So this one looks like it's from Amazon. Outlook has again protected me by preventing the download of these images. Let's go ahead and disclose that that is actually me. Now what would you do with an email like this? Less confusion here, there's just text. So how do you determine if this is legitimate? Using Outlook, if you happen to use it. Well, again, you can do this simple hover trick. Fortunately Outlook does do this, and this one's going to a website where?
In Japan. .jp means Japan, so whoever's hosting this website is outside the control of the US authorities, which suggests, incidentally, that a lot of the discussion these days in politics about making phishing and spamming illegal is all really rather a waste of time, because you can regulate Americans perhaps, but no one offshore, and this will not stop the problem. Case in point, you're going to a website that exists in Japan. This was as of October 1st. Again, it goes to a real website. No one has shut this one down yet. If I enter my Amazon information and my Amazon password, bam, I've just disclosed that information. And appreciate that with that are sometimes stored things that are more useful than your mere Amazon account name and password, like the last credit card you used, so that someone in Japan might be able to log in and use the last credit card you used to buy something. So again, recipient beware. And in fact, we can be so specific as to know the username of the guy who's running this website, because almost always anything after a tilde, as in the small URL up top, D-A-I-S-U-K-E, well, that's the username of whoever is running this particular website. It's not hard to figure out who they are, but it's tougher sometimes to actually shut them down and find out who to contact to shut them down. Well, we can do this all day long. Let's pick one more representative one. As you see, I have one here from Chase Manhattan, I have another from PayPal, Symantec Security, eBay, Citizens Bank, Kate McDonald, McDermond. How about we pick on this one. So this is clearly one with a lot of pictures. Let's go ahead and download them all. This was sent to me as a valued customer from a company called Symantec, the same company that makes Norton Utilities, which we discussed two weeks ago, and they also make antivirus software, which we'll talk about a few weeks hence.
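Both clues read off the Amazon link, the country code at the end of the host and the username after the tilde, can be extracted programmatically. Here is a small Python sketch under stated assumptions: the URL is invented (only the `.jp` ending and the `~daisuke` segment echo the example), the function names are hypothetical, and the tilde-means-username reading is a common web server convention rather than a guarantee.

```python
from urllib.parse import unquote, urlparse


def top_level_domain(url):
    """Return the last dot-separated label of the host, e.g. 'jp'."""
    host = (urlparse(url).hostname or "").lower()
    return host.rsplit(".", 1)[-1] if "." in host else ""


def tilde_user(url):
    """Return the name after a leading '~' in the first path segment, if any.

    By convention many web servers map /~name/ to that user's personal
    directory, so the name usually is an account on the machine.
    """
    path = unquote(urlparse(url).path)  # also handles an encoded %7E
    segments = [segment for segment in path.split("/") if segment]
    if segments and segments[0].startswith("~") and len(segments[0]) > 1:
        return segments[0][1:]
    return None


# Made-up URL shaped like the one in the email.
link = "http://www.example.jp/~daisuke/amazon/signin.html"
print(top_level_domain(link), tilde_user(link))
```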
This is telling me that my antivirus software has expired, so clearly I should follow some link and download the latest version. That may be true in some cases, but most companies these days would not solicit you directly and have you follow a link in an email. Similarly, if you've ever received emails from Microsoft announcing the newest bug in their software or the newest virus threat plaguing the internet, please download this Microsoft patch: Microsoft is not so benevolent as to email you, the customer, when something is wrong with their software. When you receive emails informing you to ever install software on your computer, do not do it. It is almost always someone trying to dupe you. Case in point, if we follow one of these links, let's see where we end up. Available via LiveUpdate. So LiveUpdate is a Symantec product, but Symantec does not run, I would bet, bluehornet.com. So again, beware. So let's take a step back. We've dived into a security-oriented discussion of email. I don't think we're in a day and age where we need to spend much time talking about what email is, but what about how it works behind the scenes? So when you actually click Send on an email, where does it go? How does that work? Well, here's me sitting at what I'm going to call my client. A client computer is usually one that is doing the requesting or the sending of information, unsolicited, and we're going to contrast this with, take a guess, what kind of computer? Yeah, so a server. So on the internet, and in the computer world in general, there's this inherent sort of relationship between clients and servers, where the client, like a customer in a restaurant, is typically requesting information from or providing information to a server, and the server then responds either with an acknowledgement that it received the information or with the information being requested itself. So I click Send in my email program here. What happens to that email? Well, it does in fact get transmitted to a server like this. The type of server
that it's typically transmitted to, though, which is of more relevance to you, a user, is usually called an SMTP server: Simple Mail Transfer Protocol. Well, you might have seen this if you have, like, a Comcast account or a Verizon account and you need to configure your comcast.net email account for the first time. Among the pertinent pieces of information a company, an ISP, will give you is, what is the address of your SMTP server? What is the address of your incoming mail server? Well, suffice it to say that an SMTP server, as the arrow suggests, is an outgoing mail server. By contrast, if I want to receive my emails, in other words, I sit down at my computer late at night, having been at work all day, and I want to download all of my personal emails to my computer, well, I'm going to make that request of another type of computer, and that's usually called one of two things. It's either a POP server or an IMAP server, the distinction being, POP is probably the more common. A POP server, Post Office Protocol, is just an email server that, when you tell it, give me my email, sends all of your emails over the internet down to your computer and then usually, but not always, deletes the emails from the server, so that the only copy that henceforth exists is on your client computer. You can change that, but eventually, if you leave all your emails on the server, you'll run out of space, and then typically you'll have to deal with that in some fashion. An IMAP server, by contrast, is sort of increasingly popular, and I think, frankly, for anyone who checks their mail from multiple locations, this is the ideal, because an IMAP server is similarly an incoming mail server. I, the client, would request my email from an IMAP server just as I would a POP server, but an IMAP server synchronizes, so that if I delete an email on the client, it simultaneously gets deleted here. If I, for instance, send an email from my client, the copy of it also gets saved here. Now the beauty of this sort of system is the following: I
can sit down at my home, check my email, and then quit my email program. Then I can go off to work, log into my same email account, and even if I read all of those emails originally on my client, they're still stored on the server, but anything that I deleted on the client was similarly deleted on the server, so that I have this unified view of my inbox no matter where in the world I am. Contrast this with the nuisance that is more common today: you download your emails at home, but then you maybe via webmail connect to comcast.net to check your mail from work. Well, sometimes you're sort of out of luck, because if you want some email that you read that morning, well, if you downloaded it via POP to your local computer, it's no longer there for you. Or if you have enabled an option that allows you to leave all such emails there, then what you have is this huge compendium of old emails that you then have to sift through just trying to find the more recent ones. In short, if you have the option these days as a consumer to choose the type of email account you have, IMAP is by far more flexible. And for those of you in the corporate world or at universities that use Microsoft Exchange servers, it's not IMAP per se, but it's the same idea: an Exchange server maintains this synchronization. Now, how many of you use free email services like Gmail, Hotmail, Yahoo for your primary email account? So a lot of you. So is this picture really happening? Well, you at the client are pulling up what kind of program to access your Yahoo Mail or your Hotmail? What program would you use to check your mail?
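The difference between the two kinds of incoming mail server can be seen in a toy model. The Python sketch below is not an implementation of either protocol and makes no network connections; it just mimics the behavior described above, with an in-memory "server" and hypothetical class and function names, so that the home-then-work scenario can be replayed.

```python
class MailServer:
    """A stand-in for an incoming mail server holding one user's inbox."""

    def __init__(self, messages):
        self.inbox = list(messages)


def pop_fetch(server, leave_on_server=False):
    """POP-style check: download everything, then (by default) delete it."""
    downloaded = list(server.inbox)
    if not leave_on_server:
        server.inbox.clear()
    return downloaded


class ImapClient:
    """IMAP-style client: every view and every change goes through the server."""

    def __init__(self, server):
        self.server = server

    def list_messages(self):
        return list(self.server.inbox)

    def delete(self, message):
        self.server.inbox.remove(message)


# POP: mail read at home in the morning is gone when checking from work.
pop_server = MailServer(["statement", "newsletter", "note from mom"])
at_home = pop_fetch(pop_server)
at_work = pop_fetch(pop_server)
print(len(at_home), len(at_work))

# IMAP: both locations see the same inbox, including deletions.
imap_server = MailServer(["statement", "newsletter", "note from mom"])
home, work = ImapClient(imap_server), ImapClient(imap_server)
home.delete("newsletter")
print(work.list_messages())
```

Passing `leave_on_server=True` models the option mentioned above: nothing is lost, but the server's copy only ever grows.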
Yeah, like Internet Explorer or Netscape. You would use a browser. But in this case, anytime you send an email or read an email, even though it looks to you like it's sitting on your machine, it's actually still sitting on a web server somewhere. It's sitting on Yahoo's or Gmail's or Hotmail's server. But the picture is pretty much the same. It's just that there's one additional client in the picture, which is now you, sitting at this client, interfacing with what's effectively a server, this being Hotmail, for instance. But in turn, when you say check for new mail, Hotmail, or any server that you're using, similarly checks its mail servers (has David received new email?) and stores them here to then display to you. It's the same idea, but this is why for web-based mail, like free accounts, you don't have to even worry about configuring it, because someone, Gmail, Google, Microsoft, is maintaining all these details for you. But for those of you who have email accounts through Verizon, Comcast, the picture includes only from here to the right. Questions? In what sense? I see. It's tough to say without seeing the problem, but it's quite possible that your desktop client was configured differently, in such a way that the sent mail was being stored elsewhere or simply not at all. But it's one of those questions where you'd have to sort of be there to explain it. Probably quite unrelated to this; that's more likely a client-side issue. I'm going to jump ahead in the slides and then come back to these networking comments, just so that we can sort of formalize some of the issues related to email, simple as the program itself might be, but we'll see next week why some of these semantics are useful. So canonical forms of an email address, you might guess, look a little like yours. It typically comes in two forms. Either you have some username at some domain name .com, .net, .org, .edu. Well, what are these various components called? Well, the username is typically the first part, before the at symbol, whatever your nickname
is on the system. Well, the domain name is something like hotmail or harvard or gmail, and the TLD, what does TLD stand for? This is the top-level domain. So the world is organized into relatively few TLDs, though they are increasing gradually over time. By far the most common, or at least most recognized today, remains which? .com. Most people know .edu. People might be a little more suspicious about .net or .org, or forget that that's what the address is and not .com, but .com is the most popular. But some email addresses have this added subdomain. Who, for instance, has an email address of the form username at subdomain.domain.tld? All of you would be the answer, for those of you who have set up your FAS account successfully or have emailed me or the course staff: you have emailed us from an address of the form username@fas.harvard.edu. Well, what does the TLD signify? Generally speaking, you can kind of guess, right? .com suggested a business address, sort of historically. .net suggested it was an ISP. .org suggested it was something more like a non-profit. These days anybody and everybody can have an address ending in .com, .net, .org. There are no policies involved with what kinds of people or companies use those domain names. You can buy them at will. In fact, most companies will tend to buy not just .com but all of those, just so that you don't have competition too similarly named to you. .gov is one of the few restricted TLDs, so those of you who work for the state or the feds would most likely have an address ending in .gov. whitehouse.gov is a website on the internet, but it ends in .gov, and .gov is controlled by the U.S.
government, so only government entities, or people working for the government, can have addresses in that particular domain. And there are several others, and we'll come back to those actually in a moment. Indeed, that's one of the earliest ones, .mil, but it's not often seen unless you know folks in the military with email addresses. Yep, that's what all the U.S. military services use. Well, let's take some examples here. Are any of these email addresses syntactically invalid? In other words, are these all legitimate email addresses, assuming someone you know is sitting behind that email account? Put another way, are there any typos in these email addresses? Question? Yeah, good question. In fact, you can have periods before the at symbol. They don't have special meaning like they do after the at symbol; they're simply treated like any other character. So in fact a lot of people working for companies or universities will have addresses of the form, for instance, david.malan@harvard.edu. Or Harvard uses underscores, so we have Harry Potter: harry_potter@hogwarts.edu. The underscore is similarly
simply a placeholder for what might otherwise be, in normal contexts, a space. In fact, what is absent from all these examples? Just that: spaces are not allowed in an email address. And for the most part, with a couple of exceptions, only the characters you see in the email addresses up here are actually legitimate. Most special punctuation marks are not valid. Dashes are in fact valid, so that is a legitimate email address, and in fact anakin-skywalker would be, again fitting the format, the username; that's what he would type to log into a system. Well, what about these? Which of these are syntactically valid? OK, daffy duck@looneytunes.com: not valid because of the space. There would be an error when sending this, or it would bounce back to you. The second is missing the TLD. You have to have the TLD, so that is invalid. What about the third? Interesting. What do you think? You're suspicious. What's your conclusion? It turns out that, though the convention typically is to write things in lowercase on the internet for email addresses and web addresses, it's case insensitive for the domain name, subdomain, and TLD. You can write it all in caps, as CNN.com tends to do, with capital CNN, little .com, but it's irrelevant. It's more of an aesthetic decision. It will get to nbc.com no matter how you capitalize the nbc.com. And it typically doesn't matter how you capitalize the username. Theoretically, some email servers could be picky about capitalization, but I don't know of any that are, simply because it would be annoying to be so sensitive. What about the fourth one? I should hope so, right? You've been emailing me. How about the fifth? I'm seeing some head shakes. Why? It is an address; it's actually legitimate, though. So .us is a valid TLD, and what the U.S. has done with its
municipal domains is they've divided them, for the most part, into states. So there is a ma.us for all of Massachusetts, there is a ct.us for all of Connecticut, a ca.us for all of California. Inside of each of the states, they then manage other subdomains. So Franklin, Massachusetts, is a town in which there's clearly an email server, because this is valid. And so the teachers, for instance, at Franklin High School, which is actually a school I used to work at, would have addresses of this form. Completely legitimate. Who would be housing the server in Franklin? The short answer is I don't recall. Most likely it would be someone in the school system, or it could be the state itself. For instance, Massachusetts's email system is a Microsoft Exchange Server-based system, so almost everyone in state.ma.us, whether you're local or working elsewhere in the state, is managed on the same dysfunctional servers that are managed somewhere, I think, in the Boston area. It's unclear from an address alone where it would be managed. It could theoretically be housed anywhere on the internet, but I don't know specifically here. What about the last one? Now, Yahoo might use the exclamation point in their marketing, but it is an invalid character in an email address, so it is not possible to write someone by a means like that. I mentioned this word before; this is sort of one of these terms that's only cropped up in recent years. This is an example of an email that is bad netiquette. Why? It's all caps. OK, all caps, for whatever reason, has come over the years to be interpreted as yelling. So when you write in all caps, even if it's an accidental caps lock, as seems to happen when my dad writes me sometimes, it is as though the person is yelling. And you should, generally speaking, if I can get back on my soapbox, delete the message and retype it normally, in lowercase, if only because
that is at least how people tend to interpret such things these days. You can see an example here, too, of spam, which we came across earlier. This is one that's gone around for quite some time. Notice that a lot of spam, unlike the examples we saw earlier, tends to also come from seemingly cryptic or bogus email addresses, the reason being either they're randomly generated, hence the seeming randomness of them, or the user just doesn't want any sort of association with their name, certainly, and just chooses some random sequence of characters. Another popular one, somewhat germane here: if you're getting tired of actually submitting problem sets, well, you can just go online and order a diploma from this or any other school, or by calling the numbers that were once in this email as well. But notice, just a few years ago this was more like the spam you would get. These days it's much more colorful, much more web page-based, much more interactive, if not dangerous. Right, it's a good point. Unfortunately, whatever regulations exist, there's no reason the spammers can't mimic them as well, creating bogus addresses, using legitimate addresses. So again, all of these Congress-based measures to regulate spam and phishing are really fundamentally failed. It's the technology that needs to change, not the laws, unless you're just interested in cracking down on Americans. But it's too easy and cheap to just go offshore with a server of your own. OK, it is. So this gentleman just said, for those of you following along at home: Norton Utilities, with anti-spam protection. Yeah, a lot of these products are pretty good. Unfortunately, they're always one step behind the spammers, which means you need to update them frequently.
For instance, on my email account I probably receive a hundred or more spams a day. Most of them, nicely enough, end up in my quote-unquote "probably spam" folder, which I glance at for maybe five seconds a day. But I haven't seen a false positive in there, an email that shouldn't be in there, in many weeks. And this is because the computer science department manages this particular product that they use, and they tend to work well. I would absolutely recommend programs like that. Hotmail and Gmail all have built-in anti-spam technologies. Unfortunately, they don't always work perfectly, so it's sort of trial and error, and sometimes you do want to check the folder called spam lest an email from your mom have ended up in there, since your mom mentioned keywords that shouldn't appear in emails. Or, and this is actually an interesting point, and we haven't seen an example of this: have you ever gotten an email where it might be some kind of advertisement, but then at the bottom there are random words, like someone was having way too much fun with the thesaurus, completely unrelated to the email? Yes? Well, keep an eye out for this if you haven't seen it. Those are attempts on spammers' parts to avoid detection by spam programs, because what spam programs will typically do is look for too high a frequency of mentions of money or dollar signs or Viagra these days or diplomas, any of these keywords typically associated with spam. So if you bulk up your email by including some random words from the dictionary, literally, well, that lowers the frequency of the worrisome words, and thereby the program might decide, "Yeah, this is about Viagra, but maybe they're discussing Viagra; I should let it through." So these are some of the anti-anti-spam techniques. Another is manifest in this example. Notice how "diploma" is spelled: with spaces. That is a trivial attempt to avoid detection, because it doesn't say "diploma"; it says D, space, I, space, and so forth. All these little techniques come into
play. Misspellings of words are often used simply to evade spam detection. Mm-hmm. It would not, is the short answer. If you are using web-based mail, no locally installed software is going to help you, unless you are downloading the email with, like, Outlook or Eudora from Hotmail or Gmail or Yahoo, because some of those websites allow you not only to visit the website to check your mail but also to configure the POP and SMTP servers and manually download. Then it would become relevant. But otherwise you need to trust Yahoo's and Hotmail's filters, and Hotmail in particular is god-awful; their spam techniques do not work well. As an aside, I was talking with someone in the Extension School the other day who has received one or more personal emails from Steve Ballmer and Bill Gates. And Bill Gates is reputedly the most emailed man in the world, such that he gets one million emails a day. Many of these are probably spam, right? Who better to spam than someone you're annoyed at for whatever reason? But just the sheer computer science task of sifting through those emails so that he can actually use email as a technology is remarkable, and I'm sure they throw both computers at this and low-paid interns at reading through his emails. But it's a remarkable problem. So if you think your spam is bad, just consider other figures in the world, perhaps. Other questions on email? Well, this one's always fun. I'm going to ask you for just a moment to rotate your heads ninety degrees, and you'll see a non-exhaustive list of these goofy little things that are called emoticons. You've probably seen several of them before: smiley faces, winking faces, happy faces, tired faces. The one most germane to this university here perhaps is
maybe this one here, if you can read it. I haven't quite seen that one before: a little pointy nose. So emoticons are simply ways of taking the edge off of maybe that email you otherwise wrote in all caps, but otherwise it's just a silly convention that most people, myself included, have tended to adopt. Email is sort of a representative application, one of the most popular applications by far on the internet, and one of the earliest ones used back in the day. Let's now focus for a moment more on the networks themselves. We introduced earlier this notion of clients and servers, and this is a theme that's going to recur throughout the course, really, whenever we talk about software or other hardware configurations. Let's try to qualify again exactly how things are structured on a higher level, and again, next week we're going to take the hood off of these things and actually look at how all of this is implemented and why that is actually relevant. Well, we pretty much talked about domains in the context of email a moment ago. The blanks here are for an at-home exercise, if you will, on what domains you're familiar with and what subdomains you might have experienced. But we've enumerated a few here. We mentioned FAS before; I've mentioned POST and EECS tonight. DCE exists, Law exists. HBS doesn't exist: you have hbs.edu. The business school has its own domain name; they do not exist within harvard.edu. That's sort of a university curiosity. But there are many other TLDs besides .edu, and this is a nearly current version. I thought it would be most catchy if I highlighted the very last one there, .xxx, which is receiving a whole bunch of attention in the U.S. media these days. It hasn't yet been deployed, but it's been proposed, as you might guess, for adult-oriented websites. The other TLDs noted here, for the most part, are all in use. The earliest ones include .com, .edu, .gov, .mil, .net, and .org. The others also exist these days; coming into more frequent use is perhaps .info or .name.
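To make the anatomy of an address concrete, here is a minimal Python sketch of the pieces named so far: username, optional subdomain, domain, and TLD. It was not shown in lecture. The function name `split_address` is hypothetical, and the rule that the last two dot-separated labels are the domain and the TLD is a simplifying assumption that ignores layered schemes such as the state-level .us domains. It checks only the simplified rules discussed tonight (one at symbol, no spaces, a TLD present), not the full email standard.

```python
def split_address(address):
    """Split username@host into the parts named in lecture (simplified rules)."""
    username, at_sign, host = address.partition("@")
    # Lecture rules: there must be a username, an at symbol, and no spaces.
    if not at_sign or not username or " " in address:
        raise ValueError("not of the form username@host")
    # Everything after the at symbol is case insensitive, so normalize it.
    labels = host.lower().split(".")
    # At least a domain and a TLD are required, with no empty labels.
    if len(labels) < 2 or "" in labels:
        raise ValueError("host needs at least a domain and a TLD")
    return {
        "username": username,
        # Assumption: anything left of the last two labels is the subdomain.
        "subdomain": ".".join(labels[:-2]) or None,
        "domain": labels[-2],
        "tld": labels[-1],
    }
```

For example, `split_address("username@fas.harvard.edu")` yields the subdomain `fas`, the domain `harvard`, and the TLD `edu`, while an address containing a space raises `ValueError`.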
.arpa was also one of the earliest ones, and then some of these are still in the so-called startup phase: they've been proposed or approved but haven't actually been deployed yet. That's a good question. So you've probably seen some addresses of the form .tv, maybe in web addresses, but there are no two-letter TLDs on here. In fact, .us is not on here. But they are on this next page. You have here a nearly exhaustive list of every country in the world that has its own TLD. It's intentionally small; you needn't get specifics out of this slide, but take away that every one of these countries has its own two-letter TLD, typically called ccTLDs, country code TLDs, and every country by convention has the rights of managing its own TLD. Now the U.S., as the inventor of the internet years ago, pretty much laid claim to managing .com and .net and .gov, so only the U.S. government uses .gov. Any other country has domains ending only in .xy, where xy is one of these two-character codes. But notice, .tv actually comes from what country? It's small to read, but in the top right, .tv belongs to a tiny little southeast Pacific country called Tuvalu, which, no joke, as a means of raising revenue, sold off the rights to its country code, because in the English-speaking world .tv has some very useful commercial applications. You know, m.tv or ed.tv, or any TV show these days that wants a website: they've started to gravitate toward this domain for the mere fact that it connotes television. But it's just a little tiny country that wanted to allow other people outside of its island to register domain names, and there have been other countries that have started to do that as well.
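The two-letter convention just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an authoritative registry lookup: the function name `tld_kind` is hypothetical, the set of "original" TLDs is simply the six named in lecture, and a real check would consult the official list of country codes rather than count letters.

```python
# The six early TLDs named in lecture (assumption: treated here as a fixed set).
ORIGINAL_GENERIC = {"com", "edu", "gov", "mil", "net", "org"}

def tld_kind(hostname):
    """Classify a hostname's TLD using the two-letter country-code convention."""
    # The TLD is the rightmost dot-separated label; names are case insensitive.
    tld = hostname.lower().rstrip(".").rsplit(".", 1)[-1]
    if len(tld) == 2 and tld.isascii() and tld.isalpha():
        return "country-code"
    if tld in ORIGINAL_GENERIC:
        return "original generic"
    return "other"
```

Under these assumptions, a name ending in .tv or .us is reported as a country-code TLD even when, as with .tv, it is marketed well beyond the country that owns it.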
But some, like .qa in Qatar, are not nearly as popular among the English-speaking world, so Tuvalu had the advantage there. But all these exist, and there exist websites in almost all of these domains. In fact, you can see such examples even for U.S.-based companies. If we visit not cnn.com but www.cnn.co.jp, which country's news am I about to get? Yeah, Japan. If I click Install, we'll actually see correct Japanese, but for now let's just take a look at what the homepage looks like. And for those unfamiliar, that is not actually Japanese, since I simply haven't installed the Japanese language pack. But it's clearly an example of a U.S.-oriented company actually having domains elsewhere, and there are certainly Japan-only companies in that domain. What tends to be the convention in a lot of countries, Japan and the UK in particular, is they've adopted this .co.countrycode convention, or sometimes .com.countrycode convention, simply to sort of mimic the approach that the U.S. has taken by naming everything in a .com realm, but .co is just shorter for some reason. So, questions just on these semantics, on the jargon related to domain names? Yeah. So some of the domain names, as you've noted, do have some restrictions around them. And so the ones that are sponsored, for instance, are essentially managed so that you as an individual could not buy a domain name in .museum. They are sponsored in the sense that there's an organization that actually manages the allocation of these names, and they would deny such requests. Many of them, though, are simply unsponsored, in that they weren't initiated, for instance, by a private group, but they might still have some restrictions. .edu has restrictions, .gov has restrictions, .mil has restrictions, and so forth. So, more fun: have you ever sort of wondered for yourself, how do I get my own website? Maybe it's sort of been a moot point or sort of out of your means, but it's actually kind of a cool and amazingly easy thing to do these days. How would you go about getting your own
domain name, me.com? That's pretty much it: search for it and pay for it, and then it is yours. Now, there's a second step to actually using it. You then typically have to pay someone to host the domain. If you're not running your own little network in your home, you have to pay someone to actually manage the domain. But step one, for a long time, was simply a process of going to networksolutions.com, which by decree was the only entity in the world authorized to actually register domain names for people. A few years ago, though, this power was taken away from them and distributed among private entities, which has been good for consumers, in that whereas domain names three or four years ago used to cost at least thirty-five dollars a pop, now, with competition, you can pay $2.99 for a year's worth of a subscription to a domain name, and that is your ownership. One more popular one these days, and I do believe the website is ridiculously overwhelming with detail, is Go Daddy, which sort of inherently hints at the fact that almost all the good domain names are taken, so this company called itself Go Daddy for some reason. But they're quite popular, their prices are quite good, and their interface is quite powerful. Ignoring all of the visual distractions and examples of how not to design a website, hone in on this little box here. What we can do is choose now, if we wish, a domain name that we'd want to register, or we can at least check if it exists. All right, quick: what domain name do we want to register tonight? OK, CSCI E1. All right, so I'm just going to type CSCIE1. I can specify which TLD I want it in. Go Daddy has the ability to register in any one of these domains through various partnering agreements. Notice that .tv is in there, .jp is in there, but let's just go with .com, if only because when people, particularly neophytes, think of websites these days, it's the .com that still communicates, that hints, that this is a web address. Not these days, no. You can, without
restriction, register even yourself in .org, .net, or .com, or a couple of others as well. There are no longer such restrictions. Search. I'm going to say yes, and what we get back from here is, neat, it turns out, CSCIE1.com. It's usually after tonight that some enterprising student buys this domain, squats, as they say, and then charges the course staff more than the $8.95 a year to get it back. But it is in fact available, unfortunately. That is the beauty of having a fairly cryptic username like CSCIE1. If we choose something a little more reasonable, maybe cs101, which you might think connotes, you know, an introductory CS class, odds are, when we get back the response, it is in fact taken. And this is a bit of a, I hesitate to say the word, scam, because a lot of registrars offer this: you can back-order the domain, essentially paying the registrar to say, if the guy who actually owns this domain forgets to renew it or opts not to renew it year after year, let me buy it. Unfortunately, guess what they don't really tell you: many other people have similarly back-ordered the same domain name. So it's a scam in that you're kind of throwing away nineteen dollars, and I wouldn't wait twelve months to see if you can launch that new internet business of yours with the domain name you want. It is particularly hard to find good domain names these days, but that's one of the motivations for all of these new TLDs to be cropping up, and how popular they are and what the internet looks like in ten years will be an interesting thing to watch. Excellent question: how do you know these registrars, as they are called, are legitimate? To be honest, one of the beautiful things about Google these days is it's so well done that you can usually google the company's name, and if they are legitimate, there will be a lot of sites linking to them, recommending them. If it's not legitimate and it's a scam, they might not appear at all, or you might see negative discussions about
the company. So frankly, even one of the curious tricks to avoid going to a phisher's or spammer's website these days is this: I used Citibank for a while, and Citibank has this foolish approach of having multiple domain names: citi.com, citibankonline.com, citibank.com. I mean, even I as a consumer don't know which is the legitimate one. So whenever I wanted to log into my bank account, frankly, I would google Citibank and then I would visit the top choice, because by the power of its search rankings I knew that was the actual legitimate one. So the same might go for a registrar. Frankly, though, Go Daddy charges typically $6.95 a year. It's tough to beat that price, and it isn't really necessary to shop around for domain names, typically, because you can usually just ask someone, "Where'd you buy your domain name?" and go with them. So essentially Network Solutions, among others, maintains ultimate control, and so there is absolutely a hierarchy, so that they're not all operating independently. They must first check with the higher authority: is it available? Then they lock it, and then they sell it to someone else. So this would be step one, and you would simply enter your information, and Go Daddy, I concede, is confusing, but it is a really good price. Ultimately you would get an email saying you now own CSCIE1.com. I would have chosen to register this username and password, because what I would then need to do is sign up for what's called a web host. Now, a web host is just someone who operates one or more servers. They're going to give me a bit of disk space and an account on this server, and all I need to do then is simply log into my registrar account, for instance Go Daddy's account, and I need to tell Go Daddy who is hosting my website. And I tell them this by giving them the address, the internet address, like myhost.com, of my host. And then whenever someone on the internet, no matter where they are in the world, looks up CSCIE1.com, what essentially happens, and we'll
elaborate on this next week, is my laptop, for instance, would query someone like Network Solutions, saying, who manages CSCIE1.com? They would return an answer. My laptop would then make a second request for the actual location on the internet of my web host provider. And I only recently started tinkering around with some websites managed by myself, and one curiosity, one thing that I think is sort of fun to pass along, is that it doesn't even cost you that much per year. If you wanted tonight to go home and pay $6.95 for the domain name and five dollars for the year to host your website, you can be up and running on the internet with your own website, a static website, for under twelve dollars. This is one site that, if you'd like to play with it, I would recommend, for the reason that they give you, as you see in the column that's green, for five bucks a year, fifty megabytes of space, access to email, and, I believe, up to five gigabytes of transfer over the internet, which means people can download a lot over the year. Or you can pay ten dollars a year and get even much, much more. It's amazing what you can get for your money these days. Let's take a five-minute break and we'll come back with more. All right, we are back. So again, tonight is about the so-called application layer of the internet, things that you can do with the internet. The web is certainly one such thing. Email is one such internet service or internet application; the web is another. And it's quite common these days for people to conflate the World Wide Web with the internet, but they are not in fact the same thing. What is the distinction, as you might understand it today, between the internet and the World Wide Web? Part of the internet. OK, let's expound on that. It is part of, or it's run on... I don't want to put words in your mouth. OK, fair enough. The fact that its name is "worldwide" sort of connotes the fact that it is the internet. So the internet is by far the broader concept, and the internet itself is
really the physical infrastructure on top of which various applications and services run. So just as email is something you can do on the internet, so is the web something you can surf on the internet. It's again sort of at the application layer. It is not describing physical hardware interconnections; it's simply describing an application or service that runs on top of the so-called internet. And as you pointed out, the internet was really developed in the nineteen sixties by the U.S. military. Its original name was the ARPANET, after the Advanced Research Projects Agency, later renamed DARPA for defense projects, and it was designed to allow various government institutions and later universities to intercommunicate. It wasn't really until the nineteen nineties that it sort of caught on as a worldwide commercial and sociological phenomenon. But it absolutely had its origins in the United States, which is why, more to the point, the U.S. continues to control .gov and, in a sense, .com, or at least most businesses based in the .com TLD happen to exist in the U.S., though that is not a requirement. There do exist, as we've seen, .co.jp addresses, but foreigners can absolutely register names in .net, .org, and .com these days, simply because the .com TLD is more of a powerful brand than something like .co.jp might be. Well, as the sort of notion of a web implies, there are a whole lot of interconnections underlying this and all other services, and the reason for that is that the internet was designed, again, with the sort of militaristic
mindset whereby if part of your network is knocked out for various militaristic reasons, you want the rest of the network to continue to function, which is why the internet today is relatively quite resilient to significant outages. Even if routers, as they're called, and we'll discuss them next week, are out for electrical reasons, for flooding reasons, whatever, the internet tends to adapt pretty well to such breakages in parts of it, because you have many different paths from A to B. Typically you only use one such path, but other paths do exist, and we'll see in a real sense some of these paths next week and actually trace data from here to Japan, to the UK perhaps, and just down the street to Somerville or to Cambridge. But the web, for tonight, is just an application that runs on top of the internet. Just as email addresses allow you to address individuals on the internet, so these things called URLs allow you to address websites on the web. Well, the canonical form of a web address, or more properly a uniform resource locator, URL, is the following: some protocol, colon, slash, slash, then the machine that you want to access content from, and then the path. The machine would be described how? As we've seen before: with the TLD, with the domain name, maybe with the subdomain. Every computer on the internet has an address, actually a numeric address, which is a detail we'll focus on next week, but for the most part, insofar as humans are concerned, they're unique names like www.cnn.com or just cnn.com: domains and TLDs and subdomains. The path would include what? Everything after the first slash that follows the machine name. What do we mean by path? Exactly: the specific file or the folder or directory on the web server whose content you want to access. Meanwhile, the protocol at the left is almost always, so far as you know it, what? HTTP. A protocol is, generally speaking, simply a language that two computers speak. So in fact we've seen protocols tonight. When my client requests my email from an
email server, what we're speaking is a language or protocol called POP or IMAP. When I transmit an email to be delivered elsewhere in the world to an SMTP server, more specifically, I am communicating with the server using the language or protocol called SMTP. It is specific to the internet and it's specific to the type of service you're trying to access. So it's essentially a language that my browser or my email program knows how to speak and a language that the server knows how to speak. From the user's perspective, you don't care what the bits actually look like going across the wire; all you care is that you get back the day's news or you get back the day's email. But what's going on behind the scenes is sort of a standardization of two computers speaking some common language that the human himself does not need to speak. Now, most of the time when you visit sites on the web, it's http://, but a URL, even though nine times out of ten it refers to websites, can refer to any internet protocol. You might even have seen protocols like FTP, File Transfer Protocol. This is just another type of service that the internet allows for, one that allows you to transfer files from client to server or vice versa. So if you wanted to access such a server, to download or upload files, you could do, for instance, ftp://, say, fas.harvard.edu. There exist other protocols. SSH is one that you'll see in section and workshop. Telnet is an older one that's no longer used, really, because of its insecurity. Gopher was a really old one, used ten or twenty years ago. SMTP, properly speaking, is a protocol, but you wouldn't really address it with a URL like that. So suffice it to say, a URL is something much more general than what we typically use it for; it's almost always used in the form of website addresses. Here might be some representative examples. All of these are valid URLs, but notice not all of them are HTTP. The bottom one, like my example above, suggests that ftp.food.com is a
server that is running an FTP service. It's a server with which I can download and upload files. Let's take note of some of the capitalization issues. Is there any fundamental difference between examples two and three? Capitalization: will they bring up different websites? Will one produce an error? No difference. Domain names, TLDs, and subdomains are case insensitive. It does not matter. What about the difference between example one and example two? It works, but almost always... In short, a question often asked is, must a website start with www? The answer is no. Why is it typically there? Convention. People have come to expect that writing www means this is a website, and this was a useful thing, particularly a few years ago, when your mother might look at www.cnn.com like, "What is that?" But over time the www came to connote, "This is a web address." Prior to that, a few years ago, you would see ads containing not just www.cnn.com but http://www.cnn.com. Why have we migrated away from even advertising the http://? It's too much stuff, right? It's too much for Mom to remember. And my mom's actually rather savvy, so I don't need to pick on her; I mean the mother in the general sense. But it's too much for the consumer to have to bother remembering. Even though to visit a website it is required to specify the protocol, most browsers today put it in for you. You just type www.cnn.com, hit Enter, and what happens? Well, the URL very quickly changes to prepend http:// to it, only because the browser is assuming that is the protocol you want. But it is not required to be HTTP; it's just a convenience. Whether or not www.cnn.com and cnn.com lead to the same place is completely dependent on the system administrators. It is not a guarantee that both will lead to the same place, or even that both will work. In fact, there are many sites on the web where something.com does not work for some reason and you actually
have to put in the www, which is a nuisance. Good question. The browser would only automatically insert it if you did not specify one explicitly; it won't change the protocol you specify. Good question, though. More so, even; it's been around for quite some time. If you visit a website whose address starts with https, the s does imply security. Specifically, it implies that a protocol called SSL, and we'll come back to this in our security lectures, is in use. In a nutshell, it just means that all the traffic between your browser and the server is encrypted; it's scrambled. So this is a good thing for credit card information, usernames, passwords, and so forth. That's a good thing to see, typically, in a website address, especially if you're inputting personal information. Another question was back here? No, no longer. Which of these is syntactically valid? Normal people, infrequently; students in this course, more frequently, soon, and you'll do that in section. But the commonplace protocol is HTTP for the masses. No, we will use a specialized client that makes it easier than the browser. It's an excellent question: the slashes, are they pointing the wrong way? Does it matter? What does the audience think? Doesn't matter? It does matter, officially, because that is in violation of the canonical form we just saw. Browsers these days are forgiving. They sort of expect that maybe the user doesn't know quite which one to use, so browsers will typically fix them. But properly speaking, the first one is invalid because the slashes are simply in the wrong direction. What about the second one? Yeah, it'll work, because the browser is forgiving. If we want to be nitpicky, it is not a valid URL, because it lacks the protocol, but it will work if it's a website that you're trying to visit. What about the third one? It ends with .html. Worrisome? That's actually OK. In fact, most pages on the internet do in fact end with index.html. Case in point: if I go to cnn.com, we get the day's news, but if I go to cnn.com slash
index.html, guess what I get? The day's news. So in fact the default file name that is assumed when you visit just cnn.com is index.html. And when we start to do our own website development in this course with your FAS accounts, that is precisely the very first web page that you will make: a file called index.html. But by convention you usually don't have to type it, because in its absence it is assumed to be the file that the user actually wants. So it turns out that number four is also legitimate, even though it has a question mark and an equal sign and an ampersand. You would typically see URLs like that in websites that you're interacting with. For instance, if I pull up Yahoo and search for pink flamingos, notice what happens to the URL. It gets pretty crazy pretty quickly, but you see similar features: question mark, p equals pink plus flamingos, ampersand. Essentially, all of that crazy syntax is a way of a browser passing additional input into a web server. It happens to be visible to the user, and it happens quite frequently, but you rarely have to type such cryptic things; it all happens automatically. Finally, the last two: valid or invalid? I see some heads shaking. Why? The second to last is missing the TLD, so it is invalid. And the last? It's missing a slash, so it too is invalid. So during break the following search was mentioned, and I would be remiss in not conveying this to all of you, as many internet forwards have done, in a lecture about the web. Have you ever googled, for instance, "miserable failure"? So if you google "miserable failure", as you know, with Google typically the top hit is the most important one, the most authoritative one, the correct website. This is not a political slight on Google's part. This does in fact lead to whitehouse.gov, and specifically to George W. Bush's website, his personal profile. So how in the world is that possible? Assuming there's no shenanigans going on at Google, it could be a matter of redirection, but in fact
this is what you see is what you get. In fact, Google is giving you the legitimate whitehouse.gov address. Also pointed out earlier: don't visit whitehouse.com by accident, though rumor has it (and I won't verify this during class while we're rolling film) that it was taken down recently. So that used to make for a fun demo, but not since we broadcast over the internet. It was not a .gov site, let's just say. But this one, how's this working? "But it's for sale." Yeah, indeed. So there's a couple of tricks that Google uses. One of them is this move-to-front heuristic: the more you click on it, the more popular it becomes. But also what people have done, and this is a true collusion effect, is a lot of people on the internet who maintain websites have created hyperlinks in their web pages, the text of which is "miserable failure" and the hyperlink for which is whitehouse.gov slash George Bush. So actually, similar to what the spammers are doing by putting their address but associating it with the name PayPal, these guys have simply all gotten together and said, hey, let's associate "miserable failure" with the URL for George Bush's web page. Enough people got together and did this, and now I'm sure even more people do it because, hey, look how fun this is. So the rankings just get stronger and stronger. And in fact another fun one, and you've got to look closely here, is something like... We seem to have a lawyer in the audience. How libelous this is, I don't know. "Miserable failure" is pretty innocuous, I suppose; if you had "die die die" linking to someone, the agencies would get more involved, frankly. But George Bush, if he wanted to be crafty, could go ahead and redirect that web page to be some dead link, but I'm sure people would be creative enough to come up with a new link. So unless he wants to rip his profile down altogether... he probably doesn't really understand what's going on anyway, frankly. But if he wanted to care, I think he would get
more bad press, frankly, if he tried to fix the problem. But another one in the same spirit, and no judgments here in E1, politically neutral as we are: if you google "weapons of mass destruction", some other folks on the internet have had a lot of fun with this. Aw, has this changed? Weapons of mass destruction... this actually has changed. Aw, how fun. "Weapons of mass destruction Google hack." All right, you didn't used to have to type that. Aw, and now that's gone too. Aw, this is a shame. All right, let's see, there's a link to the old one. We might have to... but googling for it would be a little trickier. There, weapons of... let's see if someone left... It's just too fun to pass up. Standard weapons of mass destruction... Aw, this is a shame. Essentially, and I can't do it justice verbally: have you seen the web page that Internet Explorer usually shows you when a page is not found? It'll say "page not found" and it'll give you all these cryptic things. Well, essentially someone made a page that looked just like that and said, sorry, weapons of mass destruction not found. And then each of the bullet points were no longer technical tidbits; they were actually rather amusing comments about weapons of mass destruction. Tragic that it's no longer the top hit; I don't know why that happens. So another day I will google around to find it. But we still have a couple more services to go besides the World Wide Web, one of which you'll actually see in use in section for a good amount of time this semester, because we're going to use this to develop web pages. SSH, which just means secure shell, is another one of these internet applications. This one doesn't let you put up web pages, it doesn't let you send emails; it allows you to control an account on one machine when you're sitting at another. So specifically, what you will soon be doing in sections and workshops, or at home if you haven't already, is logging into your FAS accounts. Well, you've all obtained, hopefully, for problem set one, or shortly will, an FAS account, a username
on fas.harvard.edu. When you subsequently, perhaps with the teaching fellows' assistance, sit down at a client machine and pull up a program on a Mac called Terminal or on a PC called SecureCRT (and we'll introduce these in class), you will be communicating with a server. This server, though, is going to be one called fas.harvard.edu. What you have on the server is a username, of course, but associated with that username is what's called the home directory. You essentially have fifty megabytes of storage space, freely given to you by the university, that you can put anything you want in, whether it's emails or web pages. We'll use it for the latter, and you'll see in class, in section, in workshop how to actually create files and directories on the server. But for now, suffice it to say that what you'll do procedurally is use a program that looks a little something like this. You will then type in, for instance, fas.harvard.edu as the host name, and we'll walk you through this again. Notice that SSH1, or there's another version of this, SSH2, may be selected; either is fine. And then you're going to click Connect. You're going to get some cryptic message, which you can just say OK to for our purposes, and then you're going to be asked for that username. So ours is cscie1, Enter. It's then going to prompt me with another message for my password, and I'm going to type in our password, and then I'm going to get this blinking prompt. Now at first this might take a little getting used to, since it's a little retro, if you will, actually typing all of your commands and not so much using the mouse. What you'll find, for instance, is that when students try to use their FAS accounts for email, they say, oh, "Compose Message" doesn't work. In SSH, with a Unix system or a Linux system, as this is (the operating system known as Linux), only the keyboard commands will work. So if I wanted to use my FAS account to send mail, as you'll see, you
could use a program called Pine, though typically folks in this course wouldn't bother with these accounts for that, and then you can just type in email, but it's rather primitive. But what you're going to do with these accounts ultimately is not so much email, unless you wish, but is the following. For instance, you're going to create (and this is going to be web pages 101 in ten seconds) a file, initially with a program called Pico, called index.html. Let me make the font bigger. You are then going to type (don't start the clock yet; now): html, maybe body, "This is my first web page." You're then going to type a little more: body with a slash, another html with a slash. You're going to then save it. You're going to type a cryptic command. You're then going to go to a web browser and visit www.fas.harvard.edu, slash, tilde username, slash, the name of the file you just created, and there's your first web page. That might not happen that quickly in class, and hopefully the web pages will be a little more interesting, but that is what making web pages can be about. There are different ways of approaching it; this is one of the techniques that you will use. But you will find that there are tools these days that make making web pages a little more fun, a little more interesting than just using these terminals. But it's with these terminals that we're going to take the hood off of web pages and actually show you how they work, how you build them, how you craft them, and how you fix things yourself. In fact, though you just saw a terribly simple example of a web page, it's representative in spirit of something even like cnn.com. If you go home tonight and pull up any web page, and then go to the View menu and go to Source, or the equivalent in Mozilla or any other browser, guess what: you'll see a whole bunch of scary-looking stuff. Believe me when I say that six, eight weeks hence you will understand this and you will be able to write this, maybe
not as quickly as perhaps I just did, and you won't have such complexity, because frankly their web page is a little messy. This is not the prettiest of source code, but this is the text that makes up CNN's home page. There's not much to it, but it probably looks like Greek to you now. But I promise you, by course's end it will not, and you'll be able to make sites like this, sites like this, or... whoops, there goes the course's website. We will now remove the file I just made, refresh the web page, and fortunately the course website is back. And I do hope you've noticed that among the little aesthetic tricks the staff has spent too much time on is the logo, which, as your website can do too, changes every day. One of the things we'll get to in our multimedia lectures in this course is a challenge where, as part of the assignment, you will learn Adobe Photoshop, sort of the de facto graphics program in the world these days, but also make your own banners for the website. Also in that same problem set, incidentally, if you haven't noticed already, we hold a little competition every semester whereby, in that same multimedia problem set, for extra credit you can design a candidate mouse pad entry in Photoshop, coming up with the design that you would like to see on your very own mouse pad. We will then take a vote with the class, and the most popular mouse pad, as determined by you the students, will then go to press in quantity eighty or so, and at the very last lecture you'll walk away with your own mouse pad. On the website you can see the past three years' winners. So everyone will walk home a lucky mouse pad winner. But in the meantime, we're not quite there yet. Questions on SSH? Or, I didn't really even tell you what SSH was, did I? So you saw me using SSH. As I said earlier, SSH is a protocol, an application, that lets you control an account from one machine that exists on another. So that is precisely what I did: I logged into, from my client, my laptop, the fas.harvard.edu server. The
files that I manipulated were here, but I was clearly manipulating them from here. And that's what SSH allows you to do: to connect to a machine, control it, perhaps only via the keyboard, but to have the changes appear here, not just on your local screen. And so that's in essence what SSH is. It's secure in the sense that these arrows back and forth are encrypted, so everything you do is secure. It's scrambled; no one who intercepts the transmission can figure out what you are doing. There are a couple of other internet services that are worth mentioning, even though one or more of them is rather deprecated. And you'll notice that tonight's lecture slides are really just about putting cute cartoons on the screen without actual content; that will appear in the scribe notes for tonight. But blogs: everybody's probably heard of a blog these days. There's some ridiculous statistic like there's ten thousand new blogs put on the internet every day, which means ten thousand more people every day have way too much time on their hands. I'd be curious to see how many of those blogs are being read every day. But a blog is short for what two words? Web log. This is a technology, if that's even a fair descriptor, that's been possible for years. It means making web pages, but the convention is that a blog is sort of your online diary. LiveJournal.com is one such site; Blogger.com, run by Google, is another such site. It's simply become much more popular for people just to talk about their day on their blog. Friends perhaps read these; nobody perhaps reads these. It's sort of a strangely inverse voyeuristic way of exposing your life to the world for others to read at will.
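Since a blog, underneath, is just web pages, it may help to see the ten-second page from earlier written down. What follows is only a sketch in Python, not what will be used in section: the page text follows what was dictated (html, body, a sentence, then the two closing tags with slashes), while the names FIRST_PAGE, TagBalanceChecker, and is_balanced are made up for illustration. It checks that every tag that gets opened also gets closed, in the right order.

```python
from html.parser import HTMLParser

# The page as dictated in lecture: html, body, one sentence, then the closing tags.
FIRST_PAGE = "<html>\n<body>\nThis is my first web page\n</body>\n</html>\n"


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: tracks open tags and flags any mismatched close tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of tags opened so far and not yet closed
        self.balanced = True  # flips to False on the first mismatched close tag

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # A close tag must match the most recently opened tag.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.balanced = False


def is_balanced(markup):
    """Return True if every tag in markup is opened and closed in order."""
    checker = TagBalanceChecker()
    checker.feed(markup)
    checker.close()
    return checker.balanced and not checker.open_tags


if __name__ == "__main__":
    print(is_balanced(FIRST_PAGE))
```

Saving the contents of FIRST_PAGE in a file called index.html is, in spirit, what the Pico session did; the checker is just a quick way to catch a forgotten slash.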
They've become more mainstream these days in the realm of politics, especially with the most recent presidential election, where you have pundits actually posting their articles online and sort of calling them blogs, which is sort of a funny thing, because any reporter working for a newspaper could publish an article every day. A blogger is simply publishing articles every day, but they're calling them blogs only because it's a little more interactive, a little less official, if you will. But signing up for a blog is as simple as getting a free account on any of the sites that I mentioned, or googling the term "blog". It remains to be seen how common these things remain. Ten thousand new journal entries a day: that's a lot to keep up with. So we shall see. Instant messaging we've talked about. Instant messaging, though, does fit nicely into this arrangement, whereby I, sitting here at my laptop, acting as the so-called client, was communicating with Ken or Brian by way of AOL's or Google's server. So really, what should be in this picture when I was conversing with Brian or Ken, on this side? Exactly: another client. That server, AIM's or Google's, was simply brokering the conversation and relaying the information from one client to another. SFTP, again just another on our laundry list here of internet applications: secure FTP. We said FTP before is file transfer. You will use secure
FTP in sections and workshops so that you can upload files from your local computer to your FAS accounts. So that's different from SSH, because you don't really want to control or create the image on the server; you probably want to take it locally and move it there. For instance, if you download an image from the web with a browser and want to upload it to your FAS account, you'll use this. If you want to edit your HTML files on your local computer and upload them, you'll use secure FTP. We'll demonstrate this in class. It's very easy to use, and all it speaks to is this notion of securely transferring files; no magic. Usenet is sort of a dated technology. Those who were on the internet back in the day would remember such hierarchies as the alt newsgroups, biz, comp, humanities, misc, news, and so forth. These were essentially yesteryear's newsgroups, bulletin boards if you will. They're still in use, though by a geekier community, I would say. Google, actually, if you've ever seen the link on their website to something called Google Groups, which is one of the main links: a few years ago they bought a company called Deja News, which essentially was just an archiving service for all of these various bulletin boards. It's a brilliant repository of arcane information. So almost always, if I need, from a computer science perspective, an answer to a technical question that I just can't figure out, and the web by way of Google is failing me, I turn to Google Groups. Henceforth, for you, it should be a wonderful resource, because odds are, if you have a question that's fairly technical, somebody else out of the billions of people in the world has had that question too, and at least one of them has hopefully posted the question, and other people have answered it. So essentially you have this wonderfully free resource of a ridiculous amount of information on sports, on computers, on other things, really chatty newsgroups, but it's very powerfully searchable by Google
these days, ever since they bought the archives. People are still posting new messages, and you can read new messages by way of Google Groups, but frankly I esteem the search capabilities over all others; it's a wonderful way of finding arcane information. Questions on the internet? It's a good question. So there were a number of pop-ups that I glossed over tonight, as I tend to do in typical practice. They said a couple of things. One of them was that you were changing from an insecure site to a secure site. Essentially, in many browsers, if you are at an address that is http colon something but click a link that's https colon something, the browser will inform you that what you're about to view is going to be over a secure connection. It's sort of a useful but sort of annoying message after a while, and so I normally just ignore them. Similarly, many browsers will tell you the reverse: if you're on a secure website but you're about to be redirected to an insecure website, that could be a security concern, and so
most browsers will inform you of that, though most users, like me, will ignore that. There are other messages about security settings and the types of files that are trying to be downloaded to your computer. We'll come back to those in the internet and the security lectures, but for the most part they're useful, if redundant, warnings that most people don't understand. We'll see more of them in the future. Well, we haven't yet had our daily video, and this one is a teaser of sorts for where we're going next week. Next week, again, is the internet continued, and it will be much less application-oriented and much more detail-oriented on how the internet works. What is this DHCP server that you need to connect to when you get that new cable modem or DSL modem? What does it mean to be a gigabit Ethernet card? How fast is your DSL connection? Which should you get: DSL, cable modem, dial-up? All of these kinds of questions we will pursue. We'll do this by way of a conversation about a protocol, not unlike these, called TCP/IP, which, even if you haven't understood it, you've probably at least heard the term. And what you're about to see is a teaser trailer for a little video we'll see next week, among other things, written by some guys from Ericsson's lab that I used to work with, that gives you a hint at how these things work. So, restart this from the beginning, give you some audio, kill the bad elevator music... a very ungraceful lead-in to this. Yes. So we will see you next week at the internet, continued.
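As a recap of the URL discussion from earlier tonight, here is a minimal sketch in Python of how the pieces of an address come apart: the protocol, the host, the path, and the question mark, equals sign, and ampersand syntax that passes input to a web server. The address in SEARCH_URL is hypothetical (the host, the path, and the ei parameter are assumptions, not the real Yahoo address), and describe and looks_canonical are made-up names. The check is deliberately crude: it only looks for backslashes, a protocol, and a host, which is enough to reject the first two kinds of mistake discussed, and it does not examine the TLD.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical search address in the shape seen in lecture; host, path,
# and the "ei" parameter are assumptions made for illustration only.
SEARCH_URL = "http://search.example.com/search?p=pink+flamingos&ei=UTF-8"


def describe(url):
    """Split a URL into the parts discussed in lecture."""
    parts = urlparse(url)
    return {
        "protocol": parts.scheme,
        "host": parts.netloc,
        "path": parts.path,
        # parse_qs turns "p=pink+flamingos&ei=UTF-8" into a dictionary,
        # decoding each plus sign back into a space.
        "inputs": parse_qs(parts.query),
    }


def looks_canonical(url):
    """Crude check: forward slashes only, and both a protocol and a host present."""
    if "\\" in url:
        return False
    parts = urlparse(url)
    return bool(parts.scheme) and bool(parts.netloc)


if __name__ == "__main__":
    print(describe(SEARCH_URL))
    print(looks_canonical("http://www.cnn.com/index.html"))
    print(looks_canonical("www.cnn.com"))
```

A browser is more forgiving than looks_canonical: it will quietly repair a missing protocol or backward slashes, which is why addresses that are officially invalid still tend to work when typed.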