5min - Introductions
7min - Network
7min - reactor intro
15min - uppercase
10min - proxy1
10min - deferred
10min - proxy2
15min - BREAK
Twisted
an introductory training
About me
http://orestis.gr - @orestis
About you
Raise your hand if...
Some setup
http://twisted-talk-url TODO
Test
>>> from twisted.internet import reactor
>>> from twisted import version
>>> version
Version('twisted', 11, 0, 0)
>>>
Network programming
Server listens on a port
Client connects to port
Service request
Client reads/writes data, server reads/writes data
While servicing... another client connects to port
Timeout!
Rejected
Doesn't scale!
Saturday, 18 June 2011
Twisted?
Twisted is a networking engine written in Python, supporting numerous protocols. It contains a web server, numerous chat clients, chat servers, mail servers, and more.
Twisted!
twisted.internet
Asynchronous I/O and Events.
twisted.internet !!!
defer endpoints error protocol reactor task
twisted.internet.reactor
the loop which drives applications using Twisted
What if we could...
Eliminate Blocking?
Reactor loop
Callback functions
Reactor loop
Event happens
Callback is called
reactor.run()
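The loop-plus-callbacks idea can be sketched without Twisted at all. The toy below is only an illustration, written for Python 3: `ToyReactor`, `on` and `add_event` are made-up names, and the real reactor waits on sockets and timers rather than on a plain queue.

```python
from collections import deque


class ToyReactor:
    """A toy event loop: pull events off a queue and call the callbacks.

    Illustration only. The real Twisted reactor waits on sockets and
    timers instead of a plain in-memory queue.
    """

    def __init__(self):
        self.events = deque()   # pending (name, payload) pairs
        self.callbacks = {}     # event name -> list of callback functions
        self.running = False

    def on(self, name, callback):
        """Register a callback for an event name."""
        self.callbacks.setdefault(name, []).append(callback)

    def add_event(self, name, payload=None):
        """Queue an event; callbacks are free to queue further events."""
        self.events.append((name, payload))

    def stop(self):
        self.running = False

    def run(self):
        """Loop until stopped or until no events are left."""
        self.running = True
        while self.running and self.events:
            name, payload = self.events.popleft()
            for callback in self.callbacks.get(name, []):
                callback(payload)


seen = []
toy = ToyReactor()
toy.on("data", lambda payload: seen.append(payload.upper()))
toy.on("data", lambda payload: toy.add_event("done"))
toy.on("done", lambda payload: toy.stop())
toy.add_event("data", "hello")
toy.run()
print(seen)  # ['HELLO']
```

No callback here ever waits for anything: each one does a little work and returns to the loop, which is the discipline the reactor asks of your code.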
Run this!
upperserver.py
Run this!
A better client
multiclient.py
import socket

def make_connection(host, port, data_to_send):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((host, port))
    s.send(data_to_send)
    s.send('\r\n')
    b = []
    while True:
        data = s.recv(1024)
        if data:
            b.append(data)
        else:
            break
    return ''.join(b)

if __name__ == '__main__':
    import sys
    host, port = sys.argv[1].split(':')
    data_to_send = sys.argv[2:]
    for d in data_to_send:
        print make_connection(host, int(port), d)
Run this!
Questions so far?
Exercise coming up!
Exercise 1
Count connected clients
Announce number of connected clients when connecting
HINT: Protocols have a factory instance attribute
upperserver_ex.py
Run this!
if __name__ == '__main__':
    import sys
    host, port = sys.argv[1].split(':')
    data_to_send = sys.argv[2:]
    threads = []
    for d in data_to_send:
        t = threading.Thread(target=t_connection, args=(host, int(port), d))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    print 'finished'
threadedclient.py
Run this!
A brief recap
reactor runs forever
a factory instance is tied to a specific port
protocol instances are created for each client
implement specific methods to add functionality
Client sends a URL followed by a newline
The server returns the contents of that URL
Connection is closed
followed by a newline
from twisted.internet import protocol

class MyProtocol(protocol.Protocol):
    def connectionMade(self):
        self.buffer = []

    def dataReceived(self, data):
        self.buffer.append(data)
        if '\n' in data:
            # split on the first newline only; the rest stays buffered
            line, rest = ''.join(self.buffer).split('\n', 1)
            self.buffer = [rest]
            print line
from twisted.protocols import basic

class MyProtocol(basic.LineReceiver):
    def lineReceived(self, line):
        print line
twisted.protocols
amp basic dict finger ftp gps htb ident loopback memcache mice pcp policies portforward postfix shoutcast sip socks stateful telnet tls wire
twisted.protocols.basic
NetstringReceiver LineOnlyReceiver LineReceiver IntNStringReceiver Int32StringReceiver Int16StringReceiver Int8StringReceiver StatefulStringProtocol FileSender
twisted.protocols
Don't reinvent the wheel!
proxy1.py
Run this!
timingclient.py
Run this!
Client
Server
fetched http://amazon.com
took 1.45738196373
fetched http://apple.com
took 1.01898193359
fetched http://google.com
took 0.770982027054
Something's wrong!
Individual requests
0.770982027054
0.566583156586
1.45738196373
1.01898193359
Total: 3.81392908096
Threaded client: 3.81683182716
Eliminate Blocking?
Waiting!
The culprit
print 'fetching', line
data = urllib2.urlopen(line).read()
print 'fetched', line
Forbidden!
data = urllib2.urlopen(line).read()
Network programming
Client connects to port, server listens on a port
Customer asks for a panini
Panini stall
Wait my turn
Place order
Wait for panini
Eat panini
Network
Make connection
Send request
Read data
Use data
Waste of time!
We are idling the CPU!
Nothing else can run!
How selfish of us!
Solution: Callbacks!
Introducing Deferred
twisted.internet.defer
A Deferred is...
A promise of a result...
A result that will appear in the future...
A result you can pass around...
Something you can attach callbacks to.
Deferred Panini
import stall

def eat(panini):
    print "YUM! I've just eated a", panini

deferred = stall.order_panini(spec)
deferred.addCallback(eat)
So....
import urllib2

data = urllib2.urlopen(url).read()
print data

from twisted.web.client import getPage

def got_page(data):
    print data

deferred = getPage(url)
deferred.addCallback(got_page)
In context...
# With a Deferred
def lineReceived(self, line):
    if not line.startswith('http://'):
        return
    start = time.time()
    print 'fetching', line

    def gotData(data):
        print 'fetched', line
        self.transport.write(data)
        self.transport.loseConnection()
        print 'took', time.time() - start

    deferredData = getPage(line)
    deferredData.addCallback(gotData)

# Blocking version
def lineReceived(self, line):
    if not line.startswith('http://'):
        return
    start = time.time()
    print 'fetching', line
    data = urllib2.urlopen(line).read()
    print 'fetched', line
    self.transport.write(data)
    self.transport.loseConnection()
    print 'took', time.time() - start
Python reminder: gotData is a closure. When it runs later, it can still use line, start and self from the enclosing lineReceived call.
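The same closure behaviour in isolation, as a plain Python 3 sketch with no Twisted involved. `make_handler` and the example URLs are made up for illustration.

```python
def make_handler(url):
    """Return a callback that remembers `url` from the enclosing call."""
    def got_data(data):
        # `url` is not a parameter here: it is looked up in the
        # enclosing scope, which stays alive as long as got_data does.
        return "fetched %s (%d bytes)" % (url, len(data))
    return got_data


handler_a = make_handler("http://a.example")
handler_b = make_handler("http://b.example")

# Each closure keeps its own `url`, even though make_handler has returned.
print(handler_a("12345"))  # fetched http://a.example (5 bytes)
print(handler_b("12"))     # fetched http://b.example (2 bytes)
```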
A tidier way
def writeDataAndLoseConnection(data, url, transport, starttime):
    print 'fetched', url
    transport.write(data)
    transport.loseConnection()
    print 'took', time.time() - starttime

class ProxyProtocol(basic.LineReceiver):
    def lineReceived(self, line):
        if not line.startswith('http://'):
            return
        start = time.time()
        print 'fetching', line
        deferredData = getPage(line)
        deferredData.addCallback(writeDataAndLoseConnection,
                                 line, self.transport, start)
proxy2.py
Run this!
Client
$ python proxy2.py
fetching http://orestis.gr
fetching http://amazon.com
fetching http://google.com
fetching http://apple.com
fetched http://orestis.gr
took 0.486361026764
fetched http://apple.com
took 0.850247859955
fetched http://google.com
took 0.998661994934
fetched http://amazon.com
took 1.58235692978
Server
Much better!
Individual requests
0.486361026764
0.850247859955
0.998661994934
1.58235692978
Total: 3.917627811433
Threaded client: 1.5850892067
Exercise 2a
Implement a caching proxy server!
Save response data in plain dict
Lookup response data
QUESTION: Where should you store the dict?
proxy2_ex1.py
from twisted.internet import reactor, protocol
from twisted.protocols import basic
from twisted.web.client import getPage

class CachingProxyProtocol(basic.LineReceiver):
    def lineReceived(self, line):
        if not line.startswith('http://'):
            return
        try:
            data = self.factory.cache[line]
            self.transport.write(data)
            self.transport.loseConnection()
        except KeyError:
            def gotData(data):
                self.factory.cache[line] = data
                self.transport.write(data)
                self.transport.loseConnection()
            deferredData = getPage(line)
            deferredData.addCallback(gotData)

class CachingProxyFactory(protocol.ServerFactory):
    protocol = CachingProxyProtocol
    cache = {}

reactor.listenTCP(8000, CachingProxyFactory())
reactor.run()
Run this!
Exercise 2b
Put those cool features into use!
One callback to get a page
One callback to store it to cache
One callback to write to transport
HINT: Use defer.succeed(data) to return a ready Deferred
proxy2_ex2.py
Run this!
BREAK
Write questions on whiteboard
Writing clients
the Twisted way
Remember this?
import socket

def make_connection(host, port, data_to_send):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((host, port))
    s.send(data_to_send)
    s.send('\r\n')
    b = []
    while True:
        data = s.recv(1024)
        if data:
            b.append(data)
        else:
            break
    return ''.join(b)

if __name__ == '__main__':
    import sys
    host, port = sys.argv[1].split(':')
    data_to_send = sys.argv[2:]
    for d in data_to_send:
        print make_connection(host, int(port), d)
Forbidden!
reactor.run()
Run this!
simpleclient.py
Run this!
Observations
Data returns with random order
We cannot access the returned data
Loop never stops
Performance?
simpleclient2.py
reactor.run()
Run this!
Introducing DeferredList
A list of Deferreds!
You create it with a list of Deferreds
When all the Deferreds have finished, its callback fires.
Introducing DeferredList
from twisted.internet import defer
from twisted.web.client import getPage

pages = ['http://www.google.com', 'http://www.orestis.gr', ...]
all_deferreds = []
for page in pages:
    d = getPage(page)
    d.addCallback(gotPage)
    all_deferreds.append(d)

deferredList = defer.DeferredList(all_deferreds)

def all_finished(results):
    print "ALL PAGES FINISHED"

deferredList.addCallback(all_finished)
simpleclient3.py
Run this!
simpleclient4.py
Run this!
Exercise 3
Write a memcached SET command-line script
Run this!
Exercise 4
Make this reusable!
def memset(host, port, key, value):
    """Return a deferred that fires on successful store"""
Exercise 4
from twisted.internet import reactor, protocol, defer
from memset import MemsetProtocol

def memset(host, port, key, value):
    f = protocol.ClientFactory()
    f.protocol = MemsetProtocol
    f.key = key
    f.value = value
    f.deferred = defer.Deferred()
    reactor.connectTCP(host, int(port), f)
    return f.deferred

if __name__ == '__main__':
    import sys
    host, port = sys.argv[1].split(':')
    key, value = sys.argv[2:]
    d = memset(host, port, key, value)
    d.addCallback(lambda _: reactor.stop())
    reactor.run()
Exercise 5
Write an HTTP GET command-line script
Exercise 5
from twisted.internet import reactor, protocol, defer

class HTTPGETProtocol(protocol.Protocol):
    def connectionMade(self):
        self.buffer = []
        self.transport.write('GET %s HTTP/1.1\r\n' % self.factory.path)
        self.transport.write('User-Agent: europython/2011\r\n')
        self.transport.write('Host: %s\r\n' % self.factory.host)
        self.transport.write('Connection: close\r\n')
        self.transport.write('\r\n')

    def dataReceived(self, data):
        self.buffer.append(data)

    def connectionLost(self, reason):
        self.factory.deferred.callback(''.join(self.buffer))

def get(address, host, path):
    f = protocol.ClientFactory()
    f.protocol = HTTPGETProtocol
    f.path = path
    f.host = host
    f.deferred = defer.Deferred()
    reactor.connectTCP(address, 80, f)
    return f.deferred
That's tedious!
Sharing state between protocols can be useful
Many times we don't need it
There's another way to do this.
t.i.protocol.ClientCreator
Change your protocol to have __init__
Create it with a protocol class and args
Connect it to a host:port
Attach a callback
When the connection is made, the callback fires with the protocol instance
ClientCreator
from twisted.internet import reactor, protocol, defer
from twisted.protocols import basic

class MemsetProtocol(basic.LineReceiver):
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.deferred = defer.Deferred()

    def connectionMade(self):
        self.transport.write('set %s 0 0 %d\r\n' % (self.key, len(self.value)))
        self.transport.write(self.value)
        self.transport.write('\r\n')

    def lineReceived(self, line):
        if line == 'STORED':
            self.deferred.callback(self.key)

if __name__ == '__main__':
    import sys
    host, port = sys.argv[1].split(':')
    key, value = sys.argv[2:]
    client = protocol.ClientCreator(reactor, MemsetProtocol, key, value)
    d = client.connectTCP(host, int(port))

    def got_protocol(proto):
        # named proto so it does not shadow the protocol module
        return proto.deferred

    d.addCallback(got_protocol)
    d.addCallback(lambda _: reactor.stop())
    reactor.run()
Exercise 6a
Rewrite the HTTP GET protocol to work with t.i.p.ClientCreator
Exercise 6b
...and the plain factory way
HINT: Implement Factory.buildProtocol to customise the way...
twisted.web
the early days
twisted.web NOP
from twisted.web.resource import Resource
from twisted.web.server import Site
from twisted.internet import reactor
from twisted.python import log
import sys

log.startLogging(sys.stdout)

root = Resource()
factory = Site(root)
reactor.listenTCP(8000, factory)
reactor.run()
Run this!
twisted.web
from twisted.web.resource import Resource
from twisted.web.server import Site
from twisted.internet import reactor
from twisted.python import log
import sys

log.startLogging(sys.stdout)

class Index(Resource):
    def render_GET(self, request):
        return "HELLO"

class Page(Resource):
    def render_GET(self, request):
        return 'A PAGE'

root = Resource()
root.putChild('', Index())
root.putChild('page', Page())
factory = Site(root)
reactor.listenTCP(8000, factory)
reactor.run()
Run this!
curl -N http://localhost:8000
from twisted.web.resource import Resource
from twisted.web.server import Site, NOT_DONE_YET
from twisted.names.client import getHostByName
from twisted.internet import reactor, defer
from twisted.python import log
import sys

log.startLogging(sys.stdout)

from httpget import get
from sites import SITES

class Index(Resource):
    def got_site(self, data, site, request):
        request.write('GOT %s (%d)\r\n' % (site, len(data)))

    def render_GET(self, request):
        dl = []
        for site in SITES[:10]:
            d = getHostByName(site)
            d.addCallback(get, site, '/')
            d.addCallback(self.got_site, site, request)
            dl.append(d)
        dl = defer.DeferredList(dl)

        def finished(results):
            request.finish()

        dl.addCallback(finished)
        return NOT_DONE_YET
monitor1.py
Run this!
Timeouts!
Abort the attempt after 30 seconds
Introducing errbacks
Like callbacks, but for error conditions
Called explicitly - d.errback(reason)
Called implicitly, when a callback function raises
errback.py
Run this!
errback_wrong.py
Run this!
defer.setDebugging(True)
Calling code
def s_function(result):
    if result == "NO":
        raise Exception(result)
    else:
        return result.upper()

def a_function(result, d):
    if result == "NO":
        d.errback(Exception(result))
    else:
        d.callback(result)

try:
    result = s_function(something)
    print result
except:
    print "OH NOES"

def on_success(r):
    print r

def on_error(_):
    print "OH NOES"

d = defer.Deferred()
d.addCallbacks(on_success, on_error)
a_function(something, d)
addCallbacks?
d = defer.Deferred()
d.addCallback(on_success)
d.addErrback(on_error)
a_function(something, d)
getPage
[Diagram: callback chain getPage -> find_doodle_text / on_google_down -> on_no_doodle -> print_result]
Google Search page with a doodle: find_doodle_text returns the doodle, on_no_doodle is skipped, print_result prints the doodle
Google Search page without a doodle: find_doodle_text raises an exception, on_no_doodle returns "ERROR: No doodle found", print_result prints it
500 Error: on_google_down returns "ERROR: Google Down", print_result prints it
httpget2.py
curl -N http://localhost:8000
from httpget2 import get

SITES = [
    'aaaaaaa.nonexistantsiteprettysure.com',
    'apple.com',
    'orestis.gr',
    'localhost',
]
#from sites import SITES

class Index(Resource):
    def got_site(self, data, site, request):
        request.write('GOT %s (%d)\r\n' % (site, len(data)))

    def got_error(self, failure, site, request):
        request.write('ERROR %s (%s)\r\n' % (site, failure.getErrorMessage()))

    def render_GET(self, request):
        dl = []
        for site in SITES[:10]:
            d = getHostByName(site)
            d.addCallback(get, site, '/')
            d.addCallbacks(self.got_site, self.got_error,
                           callbackArgs=(site, request),
                           errbackArgs=(site, request),
                           )
            dl.append(d)
        dl = defer.DeferredList(dl)

        def finished(results):
            request.finish()

        dl.addCallback(finished)
        return NOT_DONE_YET
monitor2.py
Run this!
Exercise 7
Differentiate between different errors:
twisted.names.dns.DomainError
twisted.internet.error.TimeoutError
Everything else
Have a final callback that writes results to request
HINT: Use Failure.trap(error_class)
curl -N http://localhost:8000
def got_site(self, data, site):
    return 'GOT %s (%d)\r\n' % (site, len(data))

def got_get_error(self, failure, site):
    failure.trap(error.TimeoutError)
    return 'GET ERROR %s (%s)\r\n' % (site, failure.getErrorMessage())

def got_dns_error(self, failure, site):
    failure.trap(DomainError)
    return 'DNS ERROR %s (%s)\r\n' % (site, failure.getErrorMessage())

def got_other_error(self, failure, site):
    return 'UNKNOWN ERROR %s (%s)\r\n' % (site, failure.getErrorMessage())

def render_GET(self, request):
    dl = []
    for site in SITES[:10]:
        d = getHostByName(site)
        d.addCallback(get, site, '/')
        d.addCallback(self.got_site, site)
        d.addErrback(self.got_dns_error, site)
        d.addErrback(self.got_get_error, site)
        d.addErrback(self.got_other_error, site)
        d.addCallback(request.write)
        dl.append(d)
monitor3.py
[Diagram: getHostByName -> get -> got_site -> got_dns_error -> got_get_error -> got_other_error -> request.write]
curl -N http://localhost:8000
def render_GET(self, request):
    dl = []
    for site in SITES[:10]:
        d = getHostByName(site)

        def on_get_name(address, site):
            d2 = get(address, site, '/')
            d2.addCallbacks(self.got_site, self.got_get_error,
                            callbackArgs=(site,), errbackArgs=(site,))
            return d2

        d.addCallbacks(on_get_name, self.got_dns_error,
                       callbackArgs=(site,), errbackArgs=(site,))
        d.addErrback(self.got_other_error, site)
        d.addCallback(request.write)
        dl.append(d)
monitor3.py
[Diagram: getHostByName -> on_get_name / got_dns_error -> got_other_error -> request.write; inside on_get_name: get -> got_site / got_get_error]
Doesn't scale!
You just DOSed your operating system
Twisted will happily open the connections
You need to ensure you don't overload the system
Many issues
We overload the DNS
We saturate the network
We exhaust the open file limit
We do this for every request!
DeferredSemaphore
limit = defer.DeferredSemaphore(10)
d = limit.acquire()

def on_acquire(limit):
    v = func(a, b)
    limit.release()
    return v

d.addCallback(on_acquire)
d.addCallback(done)
DeferredSemaphore
dnsLimit = defer.DeferredSemaphore(5)
getLimit = defer.DeferredSemaphore(10)

for site in SITES:
    d = dnsLimit.run(getHostByName, site)

    def on_get_name(address, site):
        d2 = getLimit.run(get, address, site, '/')
        d2.addCallbacks(self.got_site, self.got_get_error,
                        callbackArgs=(site,), errbackArgs=(site,))
        return d2
Example: You do not want to have more than 5 simultaneous DNS queries and 10 GETs
We are creating the workload while we are operating on it
We seem to be blocking - everything is run inside a giant callback chain
In our case, limit.release triggers a callback, which triggers a Deferred, which triggers a callback immediately
The reactor doesn't get a chance to breathe
twisted.internet.task
cooperate
DeferredQueue
Writing clients
twisted codebase
interfaces/adapters/components/plugins
cancellable deferreds
Perspective broker
manholes?
testing