Hi all,
I'm developing a TCP socket SDK in C. The SDK uses Apple's Network framework, and I occasionally hit a weird bad access issue in the function nw_connection_send.
Looking at the stack trace, the bad access occurs in nw_write_request_create, while it is trying to release a reference. However, I could not find any documentation or source code for nw_write_request_create.
// on socket destroy, we will release the related nw_connection.
increase_ref_count(socket);
nw_connection_t nw_connection = socket->nw_connection;
dispatch_data_t data = dispatch_data_create(message_ptr->ptr, message_ptr->len, dispatch_event_loop, DISPATCH_DATA_DESTRUCTOR_FREE);
// > Bad Access here <
// Both `nw_connection` and `data` appear valid when the function is called. Calling dispatch_retain on `data` did not help.
nw_connection_send(nw_connection, data, NW_CONNECTION_DEFAULT_MESSAGE_CONTEXT, false, ^(nw_error_t error) {
    // process the message, we will release message_buf in this function.
    completed_fn(message_buf);
    reduce_ref_count(socket);
});
When I inspect nw_connection and data, both appear valid at the time of the call. I tried calling dispatch_retain on data, but it did not help. Is there any way to narrow down which object is being released?
The issue is intermittent: it fails in about 9 out of 10 attempts when I run multiple unit tests at the same time, but I rarely see it when I run a single unit test.
I therefore assume it is a race condition. Is there a way to track down which object is released?
I do understand it would be hard to track without knowing more design details of my SDK, but any related suggestions or ideas would be appreciated. Thanks in advance.
More related source code:
struct nw_socket{
nw_connection_t nw_connection;
nw_parameters_t socket_options_to_params;
dispatch_queue_t event_loop;
// ... bunch of other parameters...
struct ref_count ref_count;
};
static int s_socket_connect_fn(
const struct socket_endpoint *remote_endpoint,
dispatch_queue_t event_loop)
{
nw_socket = /*new socket memory allocation, increasing ref count*/
nw_endpoint_t endpoint = nw_endpoint_create_address(/* process remote_endpoint */);
nw_socket->nw_connection = nw_connection_create(endpoint, nw_socket->socket_options_to_params);
nw_release(endpoint);
nw_connection_set_queue(nw_socket->nw_connection, event_loop);
nw_socket->event_loop = event_loop;
nw_connection_set_state_changed_handler(nw_socket->nw_connection, ^(nw_connection_state_t state, nw_error_t error) {
// setup connection handler
});
nw_connection_start(nw_socket->nw_connection);
nw_retain(nw_socket->nw_connection);
}
// nw_socket is ref counted, call the destroy function on ref_count reduced to 0
static void s_socket_impl_destroy(void *sock_ptr) {
struct nw_socket *nw_socket = sock_ptr;
/* Network Framework cleanup */
if (nw_socket->socket_options_to_params) {
nw_release(nw_socket->socket_options_to_params);
nw_socket->socket_options_to_params = NULL;
}
if (nw_socket->nw_connection) {
nw_release(nw_socket->nw_connection);
// Print here, to make sure the nw_connection was not released before nw_connection_send call.
nw_socket->nw_connection = NULL;
}
// releasing memory and other parameters
}
static int s_socket_write_fn(
struct nw_socket *socket,
const struct bytePtr* message_ptr, // message_ptr is a pointer to allocated message_buf
socket_on_write_completed_fn *completed_fn,
void *message_buf) {
// Ideally nw_connection would not be released, as socket ref_count is retained here.
increase_ref_count(socket->ref_count);
nw_connection_t nw_connection = socket->nw_connection;
dispatch_queue_t dispatch_event_loop = socket->event_loop;
dispatch_data_t data = dispatch_data_create(message_ptr->ptr, message_ptr->len, dispatch_event_loop, DISPATCH_DATA_DESTRUCTOR_FREE);
// > Bad Access here <
// Both `nw_connection` and `data` appear valid when the function is called. Calling dispatch_retain on `data` did not help.
nw_connection_send(nw_connection, data, NW_CONNECTION_DEFAULT_MESSAGE_CONTEXT, false, ^(nw_error_t error) {
    // process the message, we will release message_buf in this function.
    completed_fn(message_buf);
    reduce_ref_count(socket);
});
}
To start, I recommend you run your test under the standard memory debugging tools. I strongly suspect you have a memory management issue here.
Regarding this line:
dispatch_data_t data = dispatch_data_create(message_ptr->ptr, message_ptr->len, dispatch_event_loop, DISPATCH_DATA_DESTRUCTOR_FREE);
This means that:
- Dispatch will create a data object with a no-copy reference to the buffer described by message_ptr->ptr and message_ptr->len.
- When the data object is deallocated (that is, when its ref count hits zero), Dispatch will free that buffer by calling free.
Is that what you’re expecting to happen? Because it doesn’t really gel with the comment in this code:
// process the message, we will release message_buf in this function.
completed_fn(message_buf);
Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"