Scala Symbol Soup Salvage

source: yande.re

By no several pages could one explain or exhausted the absurd bizarre creepy daunting esoteric flabbergasting type system in Scala.
Here are some aspects that Learn Scala by example does not cover.

Digest:

http://twitter.github.io/scala_school/advanced-types.html

Context Bound and Implicitly:

def sort[A : Ordered] => def sort[A](implicit x: Ordred[A])
def implicitly[T](implicit e: T) = e
http://stackoverflow.com/questions/3855595/what-is-the-scala-identifier-implicitly

Type Bound:

=:= <:< <%<
http://stackoverflow.com/questions/3427345/what-do-and-mean-in-scala-2-8-and-where-are-they-documented

Existential Type:

def foo(x: Array[_]) => def foo(x: Array[T] forSome {type T})
http://www.drmaciver.com/2008/03/existential-types-in-scala/

Self Type

class BarUsingFooable {self: Fooable => ....} // yet-to-be Fooable, make this binded to self
https://coderwall.com/p/t_rapw

Structural Type

def Quack(duck: {def quack: Unit})// inline anonymous class like ducktyping
http://java.dzone.com/articles/duck-typing-scala-structural

Abstract Type member

trait Job {type A; def get: A}
http://docs.scala-lang.org/tutorials/tour/abstract-types.html

Type Level Programming:

http://apocalisp.wordpress.com/2010/06/08/type-level-programming-in-scala/

Miscellaneous:

http://stackoverflow.com/questions/1025181/hidden-features-of-scala

Last but not least:

Site significantly serving Scala symbol soup salvage saves spiritually severed souls

QuickNote Descriptor in Python

TL;DR: Descriptor in Python is just getter/setter

Three ways to implement descriptor:

  1. Use a class that implement magic method __get__, __set__, __delete__
  2. Use builtin function property(fget=None, fset=None, fdel=None, doc=None)
  3. Use decorator form of property, like @property def attr(): ..., @attr.setter, @attr.deleter

Make sure all three methods are done at class level rather than at instance level.
(That is, write my_attr = property(...) under class MyClass(object): statement,
but not self.my_attr = property(..) in __init__)

Because Python’s MRO is:

  1. find attr in instance.__dict__ (instance level)
  2. if not found, find attr in instance.__class__.__dict__. If found, try return attr.__get__, otherwise return attr (class level)
  3. if not found, repeat step 2 on instance.__class__.__base__ until attr found or no base class found
  4. if not found, try returning computed attr if __getattr__ is defined

So the descriptor magic is done at class level

Pitfall:

Because descriptor dwells at class level, if one uses descriptor class implementation, all instances may share one common variable.

To fix that:

  1. Hoard one hidden variable in the instance
  2. Use a dictionary in descriptor class to store info of different instances.

The second solution needs the class hashable. (So a meta class is needed to guarantee hashability)

Personal view: To maintain Python’s one simple way to work Pythonists just introduced much more complexity.

Reference:
http://www.ibm.com/developerworks/library/os-pythondescriptors/
http://nbviewer.ipython.org/urls/gist.github.com/ChrisBeaumont/5758381/raw/descriptor_writeup.ipynb
http://docs.python.org/2/howto/descriptor.html

Breadth First Search without Queue

source: かみやまねき

TL;DR: The most intuitive implementation of BFS is using Queue

Just out of fun, I wonder whether BFS can be implemented without Queue.
While DFS can be easily done by recursion, naive BFS implementation just incurs loop mayhem.

The first solution jumped into my mind is to add a depth parameter into BFS function.
The search function only visits nodes whose depth equals to the parameter and skips nodes whose depth does not.
I am not alone

Generator is sweeping NodeJS community (admittedly this is my exaggeration). I’m quite obsessed with generator’s suspension power. Here’s my try in Python.

'''
    implicit BFS, without queue or depth
    incurs infinite circulation on loop graph
    implement for fun (Really need a queue)
'''
def bfs_gen(node):
    '''
    generator function on a node, yield an iterator of frontier nodes
    the first call yield an iterator contains the node itself
    each subsequent call yields an iterator that is composed of children's gen
    try line chains children's yields, and thus accumulates frontier nodes
    By recursion, the root node's bfs_gen yields all frontier nodes on the graph
    frontier is maintained on callstack, by the suspension feature of generators
    '''
    try:
        yield iter((node.value, ))
        # :( still needs a list to maintain generators
        children = [bfs_gen(n) for n in node.children]
    except AttributeError:
        raise StopIteration

    chain = 0xDEADBABE
    while chain != ():
        chain = ()
        for child in children:
            try:
                chain = (e for it in (chain, next(child)) for e in it)
            except StopIteration:
                continue
        yield chain

def bfs(root, pred):
    '''
    make generator on root node
    call next method to yield frontier nodes
    apply predicate function on it, if any matches, return True
    else continues to yield more frontier
    '''
    gens = bfs_gen(root)
    try:
        while not any(pred(v) for v in next(gens)):
            continue
        return True
    except StopIteration:
        return False

def main():
    '''
    3 4 5 6
     1   2
       0
    '''
    from collections import namedtuple

    node = namedtuple('Node', ['value', 'children'])
    six   = node(6, [None])
    five  = node(5, [None])
    four  = node(4, [None])
    three = node(3, [None])
    two   = node(2, [five, six])
    one   = node(1, [three, four])
    zero  = node(0, [one, two])

    def demo(value):
        '''dumb log'''
        print(value)
        return False

    # bfs(zero, demo)
    for i in bfs_gen(zero):
        demo(i.value)

if __name__ == '__main__':
    main()

Wow, much ugly, very obscure. But a old post on certain unsung site gives me an interesting solution.

def bfs(root):
    yield root
    for node in bfs(root):
        for child in node.children:
            yield label(child)

Tada, done! Let’s peruse this piece of code.
We want a generator that yields a sequence of nodes, this can be done by induction.

  1. Given the root of a tree, clearly the first node of the sequence is the root
  2. Because the next level of frontier nodes in the sequence is the children of the current level frontier nodes, construct a new sequence that lags behind.
  3. yield each child given its parent.

Doing this recursively makes a BFS generator.

The trick part is the second step, for a normal function, this will make infinite loops. But generator is expanded dynamically, execution jumps out of BFS, eschewing looping condition. Suspension of code also implicitly stores searching states, where calling stacks morph into a Queue. Of course, visited nodes also need to be maintained in a list(not checked here). Termination check is absent, as well.

Interestingly, adapting BFS into DFS only requires swapping two lines, just as substituting stack for queue in classical implementation.

def dfs(root):
    yield root
    for node in node.children:
        for child in dfs(node):
            yield child

Whatever fun generators have brought, BFS/DFS should probably never done like this. Storing information in stack does not save space. Static language like C/C++ lacks such generator feature. And, notably, function overhead and slow generator in most script language make developers repugnant to such winding trick.

Request is a good lib

source: pixiv

Web scrapping is a common task for script languages like python.
Yet Python standard libraries provide twisted utility to couple with this simple task. urllib[23$] ruins my python newbie days.
To achieve cookie support, you have to import cookielib, create a new cookie jar via the factory method, subclass a opener from the urllib, and finally use the forged opener to make a request.

Bloated boilerplate code does not seem pythonic at all. We want a simple scrapping utility that provides concise API like http verbs and, preferably, automatic session management. Request is the library comes to rescue.

The structure of Request:

  • model
  • session
  • adapter
  • api
  • etc…

model irons response and request parameters into unified objects. Both Request and Response supports generator style and file style. Magic methods empowers Request pythonic syntax. The most dirty works(say, encoding stuffs) lies here.

session acts as controller in MVC design. It combines adapter, which sends request, authentication and cookie session. Request’s streamlined API stems from an interface mocking real browsers.

adapter is a wrapper of urllib3, supporting connection pool management, keep-alive request and proxies.

api is just an alias for 1. open session 2. prepare request 3. send request. And all other modules are helpers that do stuffs like encoding and making auth token.

The design philosophy under Request is : Simple is better than functionality, however, it still grants users plenty of features.
Specific as the library aims to be, the philosophy underlying it applies much wider. jQuery’s almighty $, Chrome’s omnipotent omnibox, Google’s preeminent search engine and etc. Simplicity rocks, Usability counts.

Vimscript the terrible way

source: pixiv

Exercises: Drink a beer to console yourself about Vim’s coercion of strings to integers.
– Steve Losh, Learning Vimscript the hard way

Vim is a superb and classy text-editor, but VimScript, well, is far from a usable scripting language.

Vimscript is extremely powerful, but has grown organically over the years into a twisty maze ready to ensnare the unwary who enter it.
Options and commands are often terse and hard to read

On the surface, VimL looks like an agglomeration of Lua, Perl and primitive Vim configuration language. A litany of reserved words and special operators, along with Regex line-noises, gives rise to yet another write-only language. VimL supports both function call and primitive command, which seasons VimL with one more layer of complexity and inconsistency.

This is culpable, but is not the most irremediable obfuscation of VimL. The tittle should belong to event system. Vim executes one and only one handler for an event once a time. To bind multiple handlers to one event, users have to write their own custom function.

It seems very easy for anyone, but at field, handling Vim plugin conflicts is a Herculean task. Vim comes out with many plugins loaded, and users’ .vimrc are liable to frequent editing. Maintaining a list of event handlers requires a user to face with plugins’ source code(verbose namespace, illegible syntax, Byzantine structure, or line noises fraught with all the aforementioned three). And this will be users’ chronic concern since Vim’s configuration are constantly changing. What’s more, VimL’s events are not sufficient. Until recent that Vim 7.3 adds InsertPre event, there is no event for pressing arbitrary key. Users have to write recursive function to capture all keys

Lastly, there is little VimL material users can find. Vim’s help doc is the only comprehensive and up-to-date document, but it is awkwardly organized. Vim community also favors Python over VimL, indicating that VImL will have less chance to get enhancement. Over the internet, few complaints and swearings about VimL are heard. It shall be interpreted as few VimL developers instead of few VimL defects.

dark
sans