Clustering in sensor networks provides energy conservation, network scalability, topology stability, and reduced overhead, and it also enables data aggregation and cooperation in data sensing and processing. Wireless Multimedia Sensor Networks are characterized by directional sensing, described by the Field of View (FoV), in contrast to scalar sensors, whose sensing area is usually more uniform. In this paper, we first group multimedia sensor nodes into clusters with a novel cluster formation approach that associates nodes based on their common sensing area. The proposed cluster formation algorithm, called Multi-Cluster Membership (MCM), establishes clusters of nodes whose FoVs overlap by at least a minimum threshold area. The name Multi-Cluster Membership comes from the fact that a node may belong to multiple clusters if its FoV intersects that of more than one cluster-head and satisfies the threshold area. Compared with Single-Cluster Membership (SCM) schemes, in which each node belongs to exactly one cluster, MCM allows coordination between intersecting clusters and is therefore more efficient in terms of energy conservation in the sensing and processing subsystems, at the cost of added complexity in node and cluster coordination. The main difficulty imposed by MCM is the coordination of nodes and clusters for collaborative monitoring, whereas SCM schemes usually assign tasks in a simple round-robin manner. As a second contribution, we define a node selection and scheduling algorithm for monitoring the environment that introduces intra-cluster and inter-cluster coordination and collaboration, and we show that it prolongs the network lifetime by high prolongation factors, particularly in dense deployments.
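
The membership rule described above can be illustrated with a minimal sketch. It is not the MCM algorithm itself, only an illustration under stated assumptions: each FoV is modeled as a circular sector, the overlap is estimated on a sampling grid, the threshold is expressed as a fraction of the node's own FoV area, and the cluster-heads are taken as given. The names `FoV`, `overlap_ratio`, and `form_clusters` are hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FoV:
    """Assumed FoV model: a circular sector with its apex at the camera."""
    x: float
    y: float
    heading: float  # viewing direction, radians
    aov: float      # angle of view, radians
    radius: float   # depth of view

    def contains(self, px: float, py: float) -> bool:
        dx, dy = px - self.x, py - self.y
        dist = math.hypot(dx, dy)
        if dist > self.radius:
            return False
        if dist == 0:
            return True
        # Signed angular difference to the heading, wrapped to [-pi, pi).
        diff = (math.atan2(dy, dx) - self.heading + math.pi) % (2 * math.pi) - math.pi
        return abs(diff) <= self.aov / 2


def overlap_ratio(a: FoV, b: FoV, samples: int = 200) -> float:
    """Estimate the fraction of a's FoV that b also covers (grid sampling)."""
    step = 2 * a.radius / samples
    inside_a = inside_both = 0
    for i in range(samples + 1):
        for j in range(samples + 1):
            px = a.x - a.radius + i * step
            py = a.y - a.radius + j * step
            if a.contains(px, py):
                inside_a += 1
                if b.contains(px, py):
                    inside_both += 1
    return inside_both / inside_a if inside_a else 0.0


def form_clusters(fovs, heads, threshold):
    """Map each cluster-head index to its member indices.

    A node joins every cluster whose head it overlaps by at least
    `threshold`, so it may appear in several clusters (multi-membership).
    Cluster-head election is outside this sketch; `heads` is an input.
    """
    clusters = {h: [h] for h in heads}
    for n, fov in enumerate(fovs):
        if n in clusters:
            continue  # cluster-heads are not re-assigned here
        for h in heads:
            if overlap_ratio(fov, fovs[h]) >= threshold:
                clusters[h].append(n)
    return clusters


if __name__ == "__main__":
    aov = math.radians(60)
    fovs = [
        FoV(0.0, 0.0, 0.0, aov, 10.0),    # node 0, cluster-head
        FoV(0.0, 1.0, 0.0, aov, 10.0),    # node 1, cluster-head
        FoV(0.0, 0.5, 0.0, aov, 10.0),    # node 2, overlaps both heads
        FoV(50.0, 50.0, 0.0, aov, 10.0),  # node 3, isolated
    ]
    print(form_clusters(fovs, heads=[0, 1], threshold=0.5))
```

In this toy deployment, node 2 lies between the two cluster-heads and joins both clusters, whereas an SCM scheme would assign it to only one of them. Node 3 overlaps neither head and remains unclustered.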