Previous approaches to intent classification and slot filling have used either (1) separate models for slot filling, including support vector machines (Moschitti et al., 2007), conditional random fields (Xu and Sarikaya, 2014), and recurrent neural networks of various types (Kurata et al., 2016), or (2) joint models that diverge into separate decoders or layers…