Producing a 3D voxel model from a single view with
deep learning-based methods has garnered increasing attention.
Several state-of-the-art works introduce a recurrent neural
network (RNN) to fuse features and generate full volumetric
occupancy. However, due to long-term memory loss, the inputs
cannot be fully exploited to improve the reconstruction. Moreover,
most of these works apply 3D supervision to the whole volume
during optimization, but lack detailed silhouette supervision to
refine the reconstruction process. To
address these issues, an end-to-end object reconstruction network
with scaling volume-view supervision is proposed. We introduce
an auto-encoder 3D volume prediction network that takes a
single image from an arbitrary viewpoint as input and outputs a
voxel occupancy grid. A scaling volume-view supervision module,
which uses up-sampling to magnify errors and increase penalties,
is then leveraged to improve both global and local optimization.
Extensive experiments on the ShapeNet dataset show that
our network achieves superior performance when the scaling
volume-view supervision is involved, and that the deep residual
module boosts reconstruction performance and speeds up
optimization effectively.
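To make the idea of up-sampling-based supervision concrete, the following is a minimal sketch of how a scaling volume loss could be computed. It is an illustrative assumption, not the paper's actual implementation: the function names (`upsample_voxels`, `scaling_volume_loss`), the nearest-neighbor up-sampling, and the summed binary cross-entropy are all choices made here for demonstration.

```python
import numpy as np

def upsample_voxels(v, scale=2):
    # Nearest-neighbor up-sampling of a (D, H, W) occupancy grid along
    # all three axes: each voxel is replicated scale**3 times.
    for axis in range(3):
        v = np.repeat(v, scale, axis=axis)
    return v

def scaling_volume_loss(pred, target, scale=2, eps=1e-7):
    # Summed binary cross-entropy on the up-sampled grids. Because every
    # voxel error is replicated scale**3 times, local mistakes are
    # "zoomed" and the total penalty grows accordingly (a sketch of the
    # scaling volume-view idea, not the authors' exact loss).
    p = np.clip(upsample_voxels(pred, scale), eps, 1.0 - eps)
    t = upsample_voxels(target, scale)
    return float(-np.sum(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))
```

Summing (rather than averaging) the cross-entropy is what makes the up-sampling actually increase the penalty: a mean over the replicated voxels would be invariant to the replication.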