scenario, especially in single-feed videos recorded in tight courts,
where clutter and occlusions cannot be avoided. This paper
presents an analysis of several geometric and semantic visual features
to detect and track basketball players. An ablation study is carried
out and then used to show that a robust tracker can be built with
deep learning features, without the need to extract contextual
ones, such as proximity or color similarity, or to apply camera
stabilization techniques. The presented tracker consists of: (1) a
detection step, which uses a pretrained deep learning model to
estimate the players' pose, followed by (2) a tracking step, which
leverages pose and semantic information from the output of a
convolutional layer in a VGG network. Its performance is analyzed
in terms of MOTA over a basketball dataset with more than 10k
