Edge intelligence is an emerging technology that integrates edge computing and deep learning to bring AI to the network’s edge. It has attracted wide attention for its lower network latency and stronger privacy preservation. However, deep neural network inference is computationally demanding, which leads to poor real-time performance on resource-constrained edge devices. In this paper, we propose a hierarchical deep learning model based on TreeNet to reduce the computational cost for edge devices. Based on the similarity of the classification categories, we decompose a given task into disjoint sub-tasks to reduce the complexity of the required model. We then propose a lightweight binary classifier that evaluates whether the inference result of a sub-task is reliable. If the result is unreliable, our system forwards the input sample to the cloud server for further processing. We also propose a new strategy for finding and sharing common features across sub-tasks to improve training speed and accuracy. Experimental results on several popular datasets demonstrate that our approach speeds up inference while processing most of the input data with a low error rate.
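
The edge-to-cloud control flow described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the function `infer_at_edge`, the toy sub-task models, the rule for choosing a sub-task, and the fixed confidence threshold that stands in for the learned binary reliability classifier are all hypothetical names and assumptions introduced for illustration only.

```python
from typing import Callable, Dict, List, Tuple

Sample = List[float]
SubTaskModel = Callable[[Sample], Tuple[str, List[float]]]


def infer_at_edge(
    sample: Sample,
    pick_subtask: Callable[[Sample], str],
    subtask_models: Dict[str, SubTaskModel],
    is_reliable: Callable[[List[float]], bool],
    cloud_model: Callable[[Sample], str],
) -> Tuple[str, str]:
    """Return (label, location), where location is "edge" or "cloud"."""
    group = pick_subtask(sample)                    # choose one disjoint sub-task
    label, scores = subtask_models[group](sample)   # small model for that sub-task
    if is_reliable(scores):                         # reliability check on the result
        return label, "edge"
    return cloud_model(sample), "cloud"             # otherwise offload to the cloud


# --- Toy stand-ins (assumptions, not trained models) ---

def _two_class(sample: Sample, first: str, second: str) -> Tuple[str, List[float]]:
    # Fake "probability" of the first class, clamped to [0, 1].
    p = min(max(0.5 + sample[1], 0.0), 1.0)
    return (first if p >= 0.5 else second), [p, 1.0 - p]


def pick_subtask(sample: Sample) -> str:
    # Assumed routing rule: the sign of feature 0 selects the category group.
    return "animals" if sample[0] >= 0 else "vehicles"


SUBTASK_MODELS: Dict[str, SubTaskModel] = {
    "animals": lambda s: _two_class(s, "cat", "dog"),
    "vehicles": lambda s: _two_class(s, "car", "truck"),
}


def is_reliable(scores: List[float], threshold: float = 0.8) -> bool:
    # A plain threshold replaces the learned binary classifier in this sketch.
    return max(scores) >= threshold


def cloud_model(sample: Sample) -> str:
    # Placeholder for the full model hosted on the cloud server.
    return "car" if sample[1] >= 0 else "truck"


def demo(sample: Sample) -> Tuple[str, str]:
    return infer_at_edge(sample, pick_subtask, SUBTASK_MODELS, is_reliable, cloud_model)
```

In this sketch, a confident sample such as `[1.0, 0.4]` is answered by the small "animals" model on the device, while an ambiguous one such as `[-1.0, 0.05]` fails the reliability check and is forwarded to the cloud stand-in. Keeping the reliability check separate from the sub-task models means the offloading rule can be changed without retraining them.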